You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@river.apache.org by Patricia Shanahan <pa...@acm.org> on 2010/12/03 08:27:15 UTC

Progress, and a problem

I'm currently hunting an intermittent bug found by the test 
qa/src/com/sun/jini/test/impl/outrigger/matching/StressTestWithShutdown.td

After a failure on Hudson, I modified the .td file to make it fail more 
often by increasing the number of entries (10,000), readers (1000), and 
writers (1000).

The writers write entries in an OutriggerServerImpl JavaSpace. The 
readers read, and then take, entries that the writers wrote. Sometimes, 
a reader fails to find an entry a writer claims to have written, causing 
a timeout.

The outrigger implementation depends on the class FastList which seems 
to use the infamous Double Checked Locking idiom 
(http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html)

The good news is that any memory model related error in FastList, or the 
related class EntryHolder, would be a plausible cause of the observed 
symptom. The bad news is that FastList and EntryHolder seem to have been 
written to be very aggressively parallel, possibly by someone who was 
only familiar with sequentially consistent memory. :-(

Usually, it is easy to fix a problem once it has been located. This may 
be a bit more difficult, especially because I assume the parallelism is 
needed for acceptable JavaSpace performance.

Patricia

Re: Progress, and a problem

Posted by Dennis Reedy <de...@gmail.com>.

HazelCast provides an in memory data grid, but I'm pretty sure it has no implementation of JavaSpace or JavaSpace05, and from what I can tell does not include any Jini technology whatsoever.

On Dec 3, 2010, at 1109AM, Mike McGrady wrote:

> Do not forget HazelCast.  The spaces concept is on the upswing in my opinion.  
> 
> Sent from my iPhone
> 
> Michael McGrady
> Principal investigator AF081_028 SBIR
> Chief Architect
> Topia Technology, Inc
> Work 1.253.572.9712
> Cel 1.253.720.3365
> 
> On Dec 3, 2010, at 7:43 AM, Patricia Shanahan <pa...@acm.org> wrote:
> 
>> Gregg Wonderly wrote:
>>> Many people are using Dan Creswell's Blitz JavaSpaces implementation or commercial versions.  I'm partially inclined to suggest that we should discuss EOL of outrigger at some point. Even though Javaspaces
>>> is a large part of what Jini has been recognized for, it has a focused audience and if we don't have someone with knowledge and interest to support outrigger, it may be more of a wart than River can deal with.
>> 
>> Although I have limited multi-processor Java and River experience, I do
>> have the right general background for that mission. I've got decades of
>> system performance experience, including finding bottlenecks in
>> multiprocessor operating systems, I understand memory models, and I have
>> the academic computer science education to look for and understand the
>> latest research on concurrent data structures.
>> 
>> On the other hand, if we are merely duplicating functionality that is
>> already available from other sources, that may not be the best use of my
>> River time.
>> 
>>> One of the issues that I've found in network intensive applications, is that the latency of communications is so huge compared to code paths, that all active threads will fairly quickly end up hovering on
>>> top of any use of "synchronized" so that there is always the worst case contention for such protected resources.
>> 
>> Communications latency is something that seriously worries me in the
>> current QA strategy, in which all components run on the same system. We
>> are not testing with the sort of timing and contention issues our real
>> world users will experience. There is a risk of not finding bugs that
>> only happen with timings induced by communications latency, as well as
>> not noticing performance regressions.
>> 
>> Patricia

Re: Progress, and a problem

Posted by Mike McGrady <mm...@topiatechnology.com>.

Do not forget HazelCast.  The spaces concept is on the upswing in my opinion.  

Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 3, 2010, at 7:43 AM, Patricia Shanahan <pa...@acm.org> wrote:

> Gregg Wonderly wrote:
>> Many people are using Dan Creswell's Blitz JavaSpaces implementation or commercial versions.  I'm partially inclined to suggest that we should discuss EOL of outrigger at some point. Even though Javaspaces
>> is a large part of what Jini has been recognized for, it has a focused audience and if we don't have someone with knowledge and interest to support outrigger, it may be more of a wart than River can deal with.
> 
> Although I have limited multi-processor Java and River experience, I do
> have the right general background for that mission. I've got decades of
> system performance experience, including finding bottlenecks in
> multiprocessor operating systems, I understand memory models, and I have
> the academic computer science education to look for and understand the
> latest research on concurrent data structures.
> 
> On the other hand, if we are merely duplicating functionality that is
> already available from other sources, that may not be the best use of my
> River time.
> 
>> One of the issues that I've found in network intensive applications, is that the latency of communications is so huge compared to code paths, that all active threads will fairly quickly end up hovering on
>> top of any use of "synchronized" so that there is always the worst case contention for such protected resources.
> 
> Communications latency is something that seriously worries me in the
> current QA strategy, in which all components run on the same system. We
> are not testing with the sort of timing and contention issues our real
> world users will experience. There is a risk of not finding bugs that
> only happen with timings induced by communications latency, as well as
> not noticing performance regressions.
> 
> Patricia

Re: Progress, and a problem

Posted by Tom Hobbs <tv...@googlemail.com>.

Dan has some interesting ideas on doing remote testing on a single
machine.  Some of them we could possibly adapt to our needs/use.

http://www.dancres.org/blitzblog/2010/11/09/remoting/

Obviously, there might/probably/definitely be some QA and/or River
refactoring that needs to doing before we can do any of the stuff he
suggests.  Personally, I'd be tempted to take a test we want to run
remotely, duplicate it, and refactor it to get what we want.  But
then, I'm always in favour of adding additional tests...


On Sat, Dec 4, 2010 at 2:51 AM, Peter Firmstone <ji...@zeus.net.au> wrote:
> Gregg Wonderly wrote:
>>
>> On 12/3/2010 9:43 AM, Patricia Shanahan wrote:
>>>
>>> Gregg Wonderly wrote:
>>>>
>>>> Many people are using Dan Creswell's Blitz JavaSpaces implementation or
>>>> commercial versions. I'm partially inclined to suggest that we should
>>>> discuss
>>>> EOL of outrigger at some point. Even though Javaspaces
>>>> is a large part of what Jini has been recognized for, it has a focused
>>>> audience and if we don't have someone with knowledge and interest to
>>>> support
>>>> outrigger, it may be more of a wart than River can deal with.
>>>
>>> Although I have limited multi-processor Java and River experience, I do
>>> have the right general background for that mission. I've got decades of
>>> system performance experience, including finding bottlenecks in
>>> multiprocessor operating systems, I understand memory models, and I have
>>> the academic computer science education to look for and understand the
>>> latest research on concurrent data structures.
>>
>> I do believe you can dive in and do the discovery needed if you have that
>> interest.
>>
>>> On the other hand, if we are merely duplicating functionality that is
>>> already available from other sources, that may not be the best use of my
>>> River time.
>>
>> The other JavaSpaces implementations are just that.  I did say I was
>> partially inclined to make this suggestion.  I'm not at all interested in
>> pushing outrigger out of the nest, really.  I just don't want anyone to
>> spend too much time on one facet of River that there are actually other
>> usable open source implementations of, if we don't have bandwidth to do it.
>>
>>>> One of the issues that I've found in network intensive applications, is
>>>> that
>>>> the latency of communications is so huge compared to code paths, that
>>>> all
>>>> active threads will fairly quickly end up hovering on
>>>> top of any use of "synchronized" so that there is always the worst case
>>>> contention for such protected resources.
>>>
>>> Communications latency is something that seriously worries me in the
>>> current QA strategy, in which all components run on the same system. We
>>> are not testing with the sort of timing and contention issues our real
>>> world users will experience. There is a risk of not finding bugs that
>>> only happen with timings induced by communications latency, as well as
>>> not noticing performance regressions.
>>
>> I'm glad we both have this worry.  It is important and until you see it
>> happening, it's hard to understand.  I've spent a lot of time at the shell
>> prompt typing ^\ digging through stack traces, chasing down contention
>> issues and its not my idea of a fun time compared to writing useful code.
>>
>> Gregg Wonderly
>>
>
> Good point, anyone got any ideas for spreading the qa tests over multiple
> computers in a network?
>
> Peter.
>
>

Re: Progress, and a problem

Posted by Peter Firmstone <ji...@zeus.net.au>.

Gregg Wonderly wrote:
> On 12/3/2010 9:43 AM, Patricia Shanahan wrote:
>> Gregg Wonderly wrote:
>>> Many people are using Dan Creswell's Blitz JavaSpaces implementation or
>>> commercial versions. I'm partially inclined to suggest that we 
>>> should discuss
>>> EOL of outrigger at some point. Even though Javaspaces
>>> is a large part of what Jini has been recognized for, it has a focused
>>> audience and if we don't have someone with knowledge and interest to 
>>> support
>>> outrigger, it may be more of a wart than River can deal with.
>>
>> Although I have limited multi-processor Java and River experience, I do
>> have the right general background for that mission. I've got decades of
>> system performance experience, including finding bottlenecks in
>> multiprocessor operating systems, I understand memory models, and I have
>> the academic computer science education to look for and understand the
>> latest research on concurrent data structures.
>
> I do believe you can dive in and do the discovery needed if you have 
> that interest.
>
>> On the other hand, if we are merely duplicating functionality that is
>> already available from other sources, that may not be the best use of my
>> River time.
>
> The other JavaSpaces implementations are just that.  I did say I was 
> partially inclined to make this suggestion.  I'm not at all interested 
> in pushing outrigger out of the nest, really.  I just don't want 
> anyone to spend too much time on one facet of River that there are 
> actually other usable open source implementations of, if we don't have 
> bandwidth to do it.
>
>>> One of the issues that I've found in network intensive applications, 
>>> is that
>>> the latency of communications is so huge compared to code paths, 
>>> that all
>>> active threads will fairly quickly end up hovering on
>>> top of any use of "synchronized" so that there is always the worst case
>>> contention for such protected resources.
>>
>> Communications latency is something that seriously worries me in the
>> current QA strategy, in which all components run on the same system. We
>> are not testing with the sort of timing and contention issues our real
>> world users will experience. There is a risk of not finding bugs that
>> only happen with timings induced by communications latency, as well as
>> not noticing performance regressions.
>
> I'm glad we both have this worry.  It is important and until you see 
> it happening, it's hard to understand.  I've spent a lot of time at 
> the shell prompt typing ^\ digging through stack traces, chasing down 
> contention issues and its not my idea of a fun time compared to 
> writing useful code.
>
> Gregg Wonderly
>

Good point, anyone got any ideas for spreading the qa tests over 
multiple computers in a network?

Peter.

Re: Progress, and a problem

Posted by Gregg Wonderly <gr...@wonderly.org>.

On 12/3/2010 9:43 AM, Patricia Shanahan wrote:
> Gregg Wonderly wrote:
>> Many people are using Dan Creswell's Blitz JavaSpaces implementation or
>> commercial versions. I'm partially inclined to suggest that we should discuss
>> EOL of outrigger at some point. Even though Javaspaces
>> is a large part of what Jini has been recognized for, it has a focused
>> audience and if we don't have someone with knowledge and interest to support
>> outrigger, it may be more of a wart than River can deal with.
>
> Although I have limited multi-processor Java and River experience, I do
> have the right general background for that mission. I've got decades of
> system performance experience, including finding bottlenecks in
> multiprocessor operating systems, I understand memory models, and I have
> the academic computer science education to look for and understand the
> latest research on concurrent data structures.

I do believe you can dive in and do the discovery needed if you have that interest.

> On the other hand, if we are merely duplicating functionality that is
> already available from other sources, that may not be the best use of my
> River time.

The other JavaSpaces implementations are just that.  I did say I was partially 
inclined to make this suggestion.  I'm not at all interested in pushing 
outrigger out of the nest, really.  I just don't want anyone to spend too much 
time on one facet of River that there are actually other usable open source 
implementations of, if we don't have bandwidth to do it.

>> One of the issues that I've found in network intensive applications, is that
>> the latency of communications is so huge compared to code paths, that all
>> active threads will fairly quickly end up hovering on
>> top of any use of "synchronized" so that there is always the worst case
>> contention for such protected resources.
>
> Communications latency is something that seriously worries me in the
> current QA strategy, in which all components run on the same system. We
> are not testing with the sort of timing and contention issues our real
> world users will experience. There is a risk of not finding bugs that
> only happen with timings induced by communications latency, as well as
> not noticing performance regressions.

I'm glad we both have this worry.  It is important and until you see it 
happening, it's hard to understand.  I've spent a lot of time at the shell 
prompt typing ^\ digging through stack traces, chasing down contention issues 
and its not my idea of a fun time compared to writing useful code.

Gregg Wonderly

Re: Progress, and a problem

Posted by Patricia Shanahan <pa...@acm.org>.

Gregg Wonderly wrote:
> Many people are using Dan Creswell's Blitz JavaSpaces implementation 
> or commercial versions.  I'm partially inclined to suggest that we 
> should discuss EOL of outrigger at some point. Even though Javaspaces
>  is a large part of what Jini has been recognized for, it has a 
> focused audience and if we don't have someone with knowledge and 
> interest to support outrigger, it may be more of a wart than River 
> can deal with.

Although I have limited multi-processor Java and River experience, I do
have the right general background for that mission. I've got decades of
system performance experience, including finding bottlenecks in
multiprocessor operating systems, I understand memory models, and I have
the academic computer science education to look for and understand the
latest research on concurrent data structures.

On the other hand, if we are merely duplicating functionality that is
already available from other sources, that may not be the best use of my
River time.

> One of the issues that I've found in network intensive applications, 
> is that the latency of communications is so huge compared to code 
> paths, that all active threads will fairly quickly end up hovering on
> top of any use of "synchronized" so that there is always the worst 
> case contention for such protected resources.

Communications latency is something that seriously worries me in the
current QA strategy, in which all components run on the same system. We
are not testing with the sort of timing and contention issues our real
world users will experience. There is a risk of not finding bugs that
only happen with timings induced by communications latency, as well as
not noticing performance regressions.

Patricia

Re: Progress, and a problem

Posted by Sim IJskes - QCG <si...@qcg.nl>.

On 13-12-10 16:27, Patricia Shanahan wrote:
> Sim IJskes - QCG wrote:
>> On 13-12-10 07:16, Patricia Shanahan wrote:
>>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>>> Patricia Shanahan wrote:
>>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>>> ...
>>>>> > The important issue in FastList is that it was written with the
>>>>> JDK1.4
>>>>> > memory model. After moving River to Java 1.5, we'd have the JSR166
>>>>> work
>>>>> > and the new, consistent memory model where volatile has a true
>>>>> meaning.
>>>>> > However, this code in particular is quite complex as you have
>>>>> noted, so
>>>>> > even adjusting to the new memory model could be problematic.
>>>>>
>>
>> I dont get it. What is the role of FastList. If it is only a very fast
>> concurrent single linked list, with an iterator, i can be implemented
>> much easier.
>
> As far as I can tell it is intended to be a concurrent linked list with
> add at tail, forwards iteration starting at head (not currently done as
> an Iterator, but I think it should be) and removal. I am not yet sure
> whether removal is always done in the context of an iteration or not.
>
>>
>> It i only has a iterator and add,remove operations, only the list
>> stitching needs to be guarded. As long as no garantuees are given in
>> timing relations.
>
> Having studied it closely for many hours, the code appears to me to be
> complicated out of all proportion to its function.

This scan is only meant to shorten the time to remove is it? Would a 
double-linked list, with the forward link used for iterator and the 
backward only to find the unstitch point not be a lot easier to build? 
Would a stitch of the 2 links back and forward separately guarded cost 
so much more time? I have to find some literature on this i think.

Gr. Sim

-- 
QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397

Re: Progress, and a problem

Posted by Patricia Shanahan <pa...@acm.org>.

Sim IJskes - QCG wrote:
> On 13-12-10 07:16, Patricia Shanahan wrote:
>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>> Patricia Shanahan wrote:
>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>> ...
>>>> > The important issue in FastList is that it was written with the 
>>>> JDK1.4
>>>> > memory model. After moving River to Java 1.5, we'd have the JSR166
>>>> work
>>>> > and the new, consistent memory model where volatile has a true
>>>> meaning.
>>>> > However, this code in particular is quite complex as you have
>>>> noted, so
>>>> > even adjusting to the new memory model could be problematic.
>>>>
> 
> I dont get it. What is the role of FastList. If it is only a very fast 
> concurrent single linked list, with an iterator, i can be implemented 
> much easier.

As far as I can tell it is intended to be a concurrent linked list with 
add at tail, forwards iteration starting at head (not currently done as 
an Iterator, but I think it should be) and removal. I am not yet sure 
whether removal is always done in the context of an iteration or not.

> 
> It i only has a iterator and add,remove operations, only the list 
> stitching needs to be guarded. As long as no garantuees are given in 
> timing relations.

Having studied it closely for many hours, the code appears to me to be 
complicated out of all proportion to its function.

> The WeakHashmap role, looks strange. It is used only for a isEmpty() 
> function, and i hope that this is only used for not to add an extra 
> reference. Because when it depends on removal by losing a reference 
> because it has gone weakly reachable, there is no garanteed minimal 
> timing for the isEmpty() = true to catch up. JDK6 has a much directer 
> relation between a reference going out of scope and going weakly then JDK4.

The WeakHashmap, I believe, is to do with a thread losing interest in a 
scan. As far as I can tell, the objective is to allow a node that has 
been marked as guard node for a thread to lose that property after the 
Thread object is garbage collected. Of course, that may be hours, days, 
weeks, or months after the last time the thread touched the FastList, 
but a simple AtomicInteger would never get to zero after an abandoned scan.

I believe the code should be optimized for abandoned scans. In the 
JavaSpaces implementation, finding the oldest item in the space matching 
a template turns into a FastList search that stops at the first node 
matching the template.

If the guard node business is really needed, it would be better done 
based on an Iterator that is implemented in a FastList member class. 
This might even be a case for a finalizer, though they have their own 
problems. The Iterator's finalizer would unguard the Iterator's guard 
node. An Iterator is much less likely than a Thread to remain reachable 
after the iteration has been abandoned. Alternatively, we could provide 
a close method that would pro-actively shut down the iteration, and 
unguard the guard node. FastList is package access only, so all uses are 
under our control.

I don't know whether the guard node implements required functionality or 
is an outgrowth of synchronization confusion. It marks a node that was 
at the tail of the list when the scan started, and stops the scan 
immediately after showing that node, even if more nodes have been added 
since then. Without the guard nodes, there would be no guarantee that a 
search would ever finish, as long as new nodes are being added at the 
end of the list.

Patricia

Re: Progress, and a problem

Posted by Sim IJskes - QCG <si...@qcg.nl>.

On 13-12-10 07:16, Patricia Shanahan wrote:
> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>> Patricia Shanahan wrote:
>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>> ...
>>> > The important issue in FastList is that it was written with the JDK1.4
>>> > memory model. After moving River to Java 1.5, we'd have the JSR166
>>> work
>>> > and the new, consistent memory model where volatile has a true
>>> meaning.
>>> > However, this code in particular is quite complex as you have
>>> noted, so
>>> > even adjusting to the new memory model could be problematic.
>>>

I dont get it. What is the role of FastList. If it is only a very fast 
concurrent single linked list, with an iterator, i can be implemented 
much easier.

It i only has a iterator and add,remove operations, only the list 
stitching needs to be guarded. As long as no garantuees are given in 
timing relations.

The WeakHashmap role, looks strange. It is used only for a isEmpty() 
function, and i hope that this is only used for not to add an extra 
reference. Because when it depends on removal by losing a reference 
because it has gone weakly reachable, there is no garanteed minimal 
timing for the isEmpty() = true to catch up. JDK6 has a much directer 
relation between a reference going out of scope and going weakly then JDK4.

Rambling?

Gr. Sim

-- 
QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397

Re: Progress, and a problem

Posted by Patricia Shanahan <pa...@acm.org>.

In fairness to the implementation, it is a tuned test, designed and  
tweaked to fail quickly. I started from a single failure of an  
outrigger stress test that almost always passes. This program has a  
couple of hundred threads hitting a FastList continuously with the mix  
of actions it would see, at a much lower rate and with fewer threads,  
in the QA test.

Yes, splitting the list should be considered.

Patricia

On Dec 13, 2010, at 10:34, Gregg Wonderly <gr...@wonderly.org> wrote:

> This does fail fairly quickly (immediately) on my windows laptop.
>
> I am not sure that I have time to really look over this code.  I  
> wonder if anyone knows if this is relatively new code that John put  
> together as part of the effort to remove the use of PSE from  
> outrigger, or is the original "non-persistent" javaspaces  
> implementation?
>
> Perhaps we need to do something different here, a segmented list for  
> example, which is what PSE did with it's Vector implementation so  
> that segments of the list could be locked independently, as well as  
> allowing the segments to be "deleted" from disk once they were  
> "empty".
>
> Gregg
>
> On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>> Patricia Shanahan wrote:
>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>> ...
>>>> > The important issue in FastList is that it was written with the  
>>>> JDK1.4
>>>> > memory model. After moving River to Java 1.5, we'd have the  
>>>> JSR166 work
>>>> > and the new, consistent memory model where volatile has a true  
>>>> meaning.
>>>> > However, this code in particular is quite complex as you have  
>>>> noted, so
>>>> > even adjusting to the new memory model could be problematic.
>>>>
>>>> I've just run a modified, simplified version of my test with java
>>>> version "1.4.2_19" and an unmodified copy of FastList, and I  
>>>> still get
>>>> the NullPointerException. This changes my thinking a bit. I had  
>>>> been
>>>> working from the assumption that the issue was to do with the  
>>>> changes
>>>> in memory model between 1.4 and 1.5. I now have to consider the
>>>> possibility of a more basic bug that is independent of the memory  
>>>> model.
>>>>
>>>> If there is anyone with a FastList or Java memory model  
>>>> background who
>>>> would like to help, please reply. I would welcome another set of  
>>>> eyes
>>>> on the code, and a cross check on my conclusions so far about how
>>>> FastList is supposed to work. There seems to be a critical  
>>>> invariant
>>>> that gets broken, and once that happens we are on track to either a
>>>> NullPointerException or dropped items.
>>>>
>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a  
>>>> main
>>>> program (JDK 1.4 or later0. In both forms, all it does is fire up a
>>>> mixture of threads that repeatedly add items to a FastList and  
>>>> threads
>>>> that repeatedly remove the first item they can from the FastList.
>>>> Failures seem to require simultaneous adds and removes.
>>>>
>>>> If I don't nail this problem fairly soon, I may abandon the  
>>>> current,
>>>> rather complicated, code and switch to writing a concurrent high
>>>> performance FastList substitute for 1.5 or later.
>>>>
>>>> Patricia
>>>>
>>> I'll have a look tonight, no promises though ;)
>>
>> I'm attaching the simplified test application main program that can  
>> run, and
>> fail, under JRE 1.4, with no need for Junit.
>>
>> Here's what I think I know. First of all, I have found some dubious
>> synchronization situations. However, fixing all the things I have  
>> so far found
>> of that type has only reduced the failure rate, not eliminated  
>> failures. That
>> could be caused by changing timings without having any impact on  
>> the root cause.
>>
>> The key invariant relates to a thread that is doing a scan,  
>> starting with a call
>> to head() and proceeding through a series of calls to next() to  
>> examine nodes.
>> The head() call sets up a guard node for the thread that was the  
>> tail at some
>> point during the head call. The invariant is that the series of next 
>> () calls
>> will reach the guard node before finding a null next pointer,  
>> indicating the
>> actual tail.
>>
>> The remove call does not really remove anything, it merely marks  
>> the node
>> removed. Removed nodes are unlinked as a side effect of scans,  
>> during head and
>> next calls, but only if they are not guard nodes.
>>
>> There are additional complications in the restart and reap methods,  
>> but we can
>> ignore them for now - my test does not use them.
>>
>> Once a guard node is lost, the synchronization breaks down  
>> completely, because
>> e.g. insertion at tail is protected by synchronization on the  
>> FastList instance,
>> but unlinking of a removed node in the middle is protected, to the  
>> extent it is
>> protected at all, by synchronization on the FastList.Node instance  
>> that is being
>> removed.
>>
>> The commonest failure symptom is a scan reaching the null next  
>> pointer at the
>> end of the FIFO during head(), without first finding the guard node  
>> it just set
>> up. An alternative form of failure is loss of some entries - they  
>> get added, but
>> the remove threads never find them. The second symptom is  
>> predominant in the
>> JavaSpaces stress test that got me started on this. Messing up a  
>> next pointer
>> could cause either.
>>
>> Incidentally, I'm curious about why it has such a fragile system in  
>> which the
>> state of a scan is partly tracked by thread, when it seems like an  
>> obvious
>> candidate for the Iterator pattern. Callers do need to be able to  
>> find out if a
>> remove call succeeded or not (the node may have been removed by  
>> another thread),
>> but that could be done in an interface extending Iterator. The  
>> WeakHashMap in a
>> node that keeps track of the threads for which it is a guard would  
>> instead track
>> the Iterator. There would be no need for thread local storage, the  
>> same data
>> could be kept in the Iterator.
>>
>> Thanks for any time you can spend looking at this.
>>
>> Patricia
>>
>>
>
>

Re: datastructure classes

Posted by Gregg Wonderly <gr...@wonderly.org>.

One of the primary types of data structures that I'm thinking about is a skip 
list as defined by Bill Pugh's (http://www.cs.umd.edu/~pugh/) paper:

ftp://ftp.cs.umd.edu/pub/skipLists/skiplists.pdf

The upshot is that a skip list provides compartmentalization/segmentation of 
data which might provide the right segmentation for a persistence layer around 
the same code.

Gregg Wonderly

On 12/14/2010 2:41 AM, Patricia Shanahan wrote:
> On 12/13/2010 11:48 PM, Peter wrote:
>> Pats, I think James is talking about my classes. James, thanks for
>> the links, much appreciated, I hope you'll find time to participate
>> more often.
>>
>> I have the utmost faith in Pats ability.
>>
>> Cheers,
>>
>> Peter.
> ...
>>>>> I've got some concurrent utilities in pepe you may pinch /
>>>>> improve if you like:
>>>>>
>>>>> org.apache.river.impl.util.*
>>>>>
>>>>> ConcurrentCollections ConcurrentSoftMap
>>>>> ConcurrentWeakIdentityMap ConcurrentWeakMap
>>>>>
>>>>> ConcurrentCollections is a multi read single write lock based
>>>>> collection wrapper. It defensively copies for Iterators but
>>>>> still allows performing removals from the underlying
>>>>> collection.
>>>>>
>>>>> The ConcurrentMap's are based on ConcurrentHashMap, using a
>>>>> ReferenceQueue to remove stale entries prior to every method
>>>>> call. Cheers,
>>>>>
>>>>> Peter.
>
> I'll take a look at the code, and mine it for ideas, but just based on
> the descriptions, I think I can do better for the specifics of FastList.
> The fact that it is tail insertion only creates some opportunities that
> would not be present in a more general data structure.
>
> In particular, I am currently thinking about moving the unlinking of
> removed entries from being done as a side effect of each scan to being a
> separate TaskManager task. Each FastList would have at most one thread
> unlinking at a time, which would make it much simpler. I believe I can
> do that, in the 1.5 memory model, with volatile next pointers, and have
> very simple, fast synchronization-free iterators without copying.
>
> One trick that may make it easier and faster is to never delete the last
> node, including possibly inserting a dummy node. A list that always has
> at least one node can be simpler than one that may be empty. Of course,
> the dummy node would be invisible to callers.
>
> I am seriously considering synchronizing the tail insertions, depending
> on what I find out about the relative performance of blocking and
> non-blocking for that type of job.
>
> Patricia
>

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/13/2010 11:48 PM, Peter wrote:
> Pats, I think James is talking about my classes. James, thanks for
> the links, much appreciated, I hope you'll find time to participate
> more often.
>
> I have the utmost faith in Pats ability.
>
> Cheers,
>
> Peter.
...
>>>> I've got some concurrent utilities in pepe you may pinch /
>>>> improve if you like:
>>>>
>>>> org.apache.river.impl.util.*
>>>>
>>>> ConcurrentCollections ConcurrentSoftMap
>>>> ConcurrentWeakIdentityMap ConcurrentWeakMap
>>>>
>>>> ConcurrentCollections is a multi read single write lock based
>>>> collection wrapper. It defensively copies for Iterators but
>>>> still allows performing removals from the underlying
>>>> collection.
>>>>
>>>> The ConcurrentMap's are based on ConcurrentHashMap, using a
>>>> ReferenceQueue to remove stale entries prior to every method
>>>> call. Cheers,
>>>>
>>>> Peter.

I'll take a look at the code, and mine it for ideas, but just based on
the descriptions, I think I can do better for the specifics of FastList.
The fact that it is tail insertion only creates some opportunities that
would not be present in a more general data structure.

In particular, I am currently thinking about moving the unlinking of
removed entries from being done as a side effect of each scan to being a
separate TaskManager task. Each FastList would have at most one thread
unlinking at a time, which would make it much simpler. I believe I can
do that, in the 1.5 memory model, with volatile next pointers, and have
very simple, fast synchronization-free iterators without copying.

One trick that may make it easier and faster is to never delete the last
node, including possibly inserting a dummy node. A list that always has
at least one node can be simpler than one that may be empty. Of course,
the dummy node would be invisible to callers.

I am seriously considering synchronizing the tail insertions, depending
on what I find out about the relative performance of blocking and
non-blocking for that type of job.

Patricia

Re: datastructure classes

Posted by Mike McGrady <mm...@topiatechnology.com>.

++

Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 14, 2010, at 3:33 AM, Peter <ji...@zeus.net.au> wrote:

> 
> ----- Original message -----
>> On 12/13/2010 11:48 PM, Peter wrote:
>>> Pats, I think James is talking about my classes. James, thanks for the links,
>>> much appreciated, I hope you'll find time to participate more often.
>> ...
>> 
>> And I really should stop hijacking this thread.
> 
> 
> I think your posts are relevant, don't stop on my account.
> 
> Cheers, 
> 
> Peter.
> 
> Despite my preoccupation
>> with FastList, I do realize that in the long term the success of River
>> may depend on our general understanding of high performance data
>> structures for multiprocessor systems.
>> 
>> Patricia
>> 
>> 
>> 
>

Re: datastructure classes

Posted by Peter <ji...@zeus.net.au>.

----- Original message -----
> On 12/13/2010 11:48 PM, Peter wrote:
> > Pats, I think James is talking about my classes. James, thanks for the links,
> > much appreciated, I hope you'll find time to participate more often.
> ...
>
> And I really should stop hijacking this thread.


I think your posts are relevant, don't stop on my account.

Cheers, 

Peter.

 Despite my preoccupation
> with FastList, I do realize that in the long term the success of River
> may depend on our general understanding of high performance data
> structures for multiprocessor systems.
>
> Patricia
>
>
>

Re: datastructure classes

Posted by Mike McGrady <mm...@topiatechnology.com>.

++

And I would add in relation to the unique opportunities provided by the deep architectural features of River including especially outrigger and rio.

Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 14, 2010, at 1:54 AM, Patricia Shanahan <pa...@acm.org> wrote:

> On 12/13/2010 11:48 PM, Peter wrote:
>> Pats, I think James is talking about my classes. James, thanks for the links, much appreciated, I hope you'll find time to participate more often.
> ...
> 
> And I really should stop hijacking this thread. Despite my preoccupation with FastList, I do realize that in the long term the success of River may depend on our general understanding of high performance data structures for multiprocessor systems.
> 
> Patricia
> 
> 
>

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/13/2010 11:48 PM, Peter wrote:
> Pats, I think James is talking about my classes. James, thanks for the links, much appreciated, I hope you'll find time to participate more often.
...

And I really should stop hijacking this thread. Despite my preoccupation 
with FastList, I do realize that in the long term the success of River 
may depend on our general understanding of high performance data 
structures for multiprocessor systems.

Patricia

Re: datastructure classes

Posted by Peter <ji...@zeus.net.au>.

Pats, I think James is talking about my classes. James, thanks for the links, much appreciated, I hope you'll find time to participate more often.

I have the utmost faith in Pats ability.

Cheers,

Peter.


----- Original message -----
> You make a good point that I should take a look around to see if I can
> find anything suitable that already exists, with appropriate licensing.
>
> That said, I believe the requirements for an outrigger FastList are:
>
> 1. Highly scalable performance. In particular, iterating over the list
> should be fast even when many threads are doing it. As far as I can
> tell, any JavaSpace lookup turns into iterating over a FastList
> representing items of acceptable classes, to see if any of the nodes
> match the other template requirements.
>
> 2. A limited set of features: Iterate over the list starting at the
> head, remove a node, add at tail.
>
> The two libraries each go in a different direction.
>
> Javolution is primarily a single thread speed play, with fast and
> predictable performance. It is what I would want if I were writing real
> time code. It does have some support for concurrency, but not what is
> needed for outrigger: "Fast collections (or maps) can be made
> thread-safe by marking them FastCollection#shared shared Having a shared
> collection (or map) means that it can be safely used without
> synchronization. It does not mean that a change made by a thread is
> automatically viewed by another thread (that would require synchronizing)."
>
> The Apache Commons collections are very feature rich, but seem to be
> primarily single thread, with the synchronizing decorator approach to
> concurrency.
>
> I will do some more web searching, but my current strongest contender
> for an existing class is java.util.concurrent.ConcurrentLinkedQueue.
> It's main limitation is slow remove, but that might be worked around by
> using a pair of queues, and periodically cleaning by copying from one
> queue to another. Also, java.util.concurrent contains a reference to an
> important paper, http://www.cs.rochester.edu/u/michael/PODC96.html, that
> gives me a starting point for finding the latest research on these issues.
>
> Patricia
>
>
>
>
> On 12/13/2010 4:30 PM, James Grahn wrote:
> > Because there've been a few concerns about memory models and concurrent
> > data structures, I thought I'd suggest a couple of resources. Library
> > usage is preferable to reinventing the wheel, where possible:
> >
> > First:
> > http://javolution.org/
> >
> > I haven't used it directly; I've only used a library which used it. But
> > it perhaps deserves a scan to see if it would meet our requirements (and
> > testing to ensure it's acceptable if it claims to).
> >
> > Also, it's BSD-licensed. I believe that's acceptable?
> >
> > Second:
> > http://commons.apache.org/collections/index.html
> >
> > I don't *think* it has what we need, but it's worth poking around in
> > before creating a new datastructure. Has the advantage of being part of
> > the Apache family.
> >
> > jamesG
> >
> > On 12/13/2010 5:24 PM, Peter Firmstone wrote:
> > > Gregg Wonderly wrote:
> > > > On 12/13/2010 12:34 PM, Gregg Wonderly wrote:
> > > > > This does fail fairly quickly (immediately) on my windows laptop.
> > > > >
> > > > > I am not sure that I have time to really look over this code. I
> > > > > wonder if anyone
> > > > > knows if this is relatively new code that John put together as part
> > > > > of the
> > > > > effort to remove the use of PSE from outrigger, or is the original
> > > > > "non-persistent" javaspaces implementation?
> > > > >
> > > > > Perhaps we need to do something different here, a segmented list for
> > > > > example,
> > > > > which is what PSE did with it's Vector implementation so that
> > > > > segments of the
> > > > > list could be locked independently, as well as allowing the segments
> > > > > to be
> > > > > "deleted" from disk once they were "empty".
> > > >
> > > > And, of course if we pull out outrigger, as an application/service,
> > > > separate from the Jini part of river, we could just say that Outrigger
> > > > requires JDK1.5 so that we can move to a different concurrency
> > > > implementation if that is needed, using the new memory model.
> > >
> > > It seems like a lot of work to fix a package private implementation,
> > > already based on flawed assumptions (Patricia's done a great job
> > > debugging this, she's a real asset, I look forward to learning more
> > > debugging tips). I suspect we'll get a much better result starting from
> > > scratch, utilising Java 6.
> > >
> > > Since River is a Jini platform, why don't we start by creating an
> > > independent implementation of outrigger utilising any latest available
> > > java features. Not only will this produce a better implementation, that
> > > will be easier to support, but it might improve our understanding of
> > > what's required for a modular build as well.
> > >
> > > The existing outrigger implementation can remain as it is, but
> > > deprecated, left in place for legacy support.
> > >
> > > I've got some concurrent utilities in pepe you may pinch / improve if
> > > you like:
> > >
> > > org.apache.river.impl.util.*
> > >
> > > ConcurrentCollections
> > > ConcurrentSoftMap
> > > ConcurrentWeakIdentityMap
> > > ConcurrentWeakMap
> > >
> > > ConcurrentCollections is a multi read single write lock based collection
> > > wrapper. It defensively copies for Iterators but still allows performing
> > > removals from the underlying collection.
> > >
> > > The ConcurrentMap's are based on ConcurrentHashMap, using a
> > > ReferenceQueue to remove stale entries prior to every method call.
> > > Cheers,
> > >
> > > Peter.
> > >
> > > >
> > > > One of the things I did in griddle, was define a RemoteIterator which
> > > > allows a "get" operation to have a return value before anything is in
> > > > the space. A server side thread, then looks through the space for
> > > > matches and adds them to the return queue for the calling "get". This
> > > > allows server side contention to be managed to some degree because the
> > > > number of "searching" threads could be held to an appropriate minimum
> > > > (even one). The javaspaces API doesn't disallow such proxy
> > > > implementation and JavaSpaces05 iterator support starts to expose this
> > > > kind of thing more literally.
> > > >
> > > > Ultimately, I think a segmented list that looks like a
> > > > List<List<Entry>> would be a way to distribute locking for "get"
> > > > because iteration would be less likely to be on the same segment at
> > > > the same time.
> > > >
> > > > Insertion at the tail, is always a contentious issue for concurrent
> > > > lists. Sometimes you can just use a small ConcurrentHashMap to perform
> > > > adds until it gets to a particular size, and then turn its contents
> > > > into a List and add that list to the tail of the List<List<Entry>>.
> > > > You can choose then to decide when to do that movement by watching for
> > > > the other lists to be empty too, or when a traversal gets to the end
> > > > of what is in a list already.
> > > >
> > > > This keeps writers quite free to insert quickly and completely away
> > > > from the readers as well. Ordering presents most of the opportunities
> > > > for big time contention. Unordered (map), or more segmented (map of
> > > > list) construction relieves some of the contention if course, by
> > > > distributing it.
> > > >
> > > > Gregg
> > > >
> > > > > Gregg
> > > > >
> > > > > On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
> > > > > > On 12/12/2010 5:48 PM, Peter Firmstone wrote:
> > > > > > > Patricia Shanahan wrote:
> > > > > > > > On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
> > > > > > > > ...
> > > > > > > > > The important issue in FastList is that it was written with the
> > > > > > > > JDK1.4
> > > > > > > > > memory model. After moving River to Java 1.5, we'd have the
> > > > > > > > JSR166 work
> > > > > > > > > and the new, consistent memory model where volatile has a true
> > > > > > > > meaning.
> > > > > > > > > However, this code in particular is quite complex as you have
> > > > > > > > noted, so
> > > > > > > > > even adjusting to the new memory model could be problematic.
> > > > > > > >
> > > > > > > > I've just run a modified, simplified version of my test with java
> > > > > > > > version "1.4.2_19" and an unmodified copy of FastList, and I still
> > > > > > > > get
> > > > > > > > the NullPointerException. This changes my thinking a bit. I had
> > > > > > > > been working from the assumption that the issue was to do with the
> > > > > > > > changes
> > > > > > > > in memory model between 1.4 and 1.5. I now have to consider the
> > > > > > > > possibility of a more basic bug that is independent of the memory
> > > > > > > > model.
> > > > > > > >
> > > > > > > > If there is anyone with a FastList or Java memory model background
> > > > > > > > who
> > > > > > > > would like to help, please reply. I would welcome another set of
> > > > > > > > eyes
> > > > > > > > on the code, and a cross check on my conclusions so far about how
> > > > > > > > FastList is supposed to work. There seems to be a critical
> > > > > > > > invariant that gets broken, and once that happens we are on track
> > > > > > > > to either a NullPointerException or dropped items.
> > > > > > > >
> > > > > > > > I can supply my test as a unit test (JDK 1.6, Junit 4) and as a
> > > > > > > > main program (JDK 1.4 or later0. In both forms, all it does is
> > > > > > > > fire up a mixture of threads that repeatedly add items to a
> > > > > > > > FastList and threads
> > > > > > > > that repeatedly remove the first item they can from the FastList.
> > > > > > > > Failures seem to require simultaneous adds and removes.
> > > > > > > >
> > > > > > > > If I don't nail this problem fairly soon, I may abandon the
> > > > > > > > current, rather complicated, code and switch to writing a
> > > > > > > > concurrent high performance FastList substitute for 1.5 or later.
> > > > > > > >
> > > > > > > > Patricia
> > > > > > > >
> > > > > > > I'll have a look tonight, no promises though ;)
> > > > > >
> > > > > > I'm attaching the simplified test application main program that can
> > > > > > run, and
> > > > > > fail, under JRE 1.4, with no need for Junit.
> > > > > >
> > > > > > Here's what I think I know. First of all, I have found some dubious
> > > > > > synchronization situations. However, fixing all the things I have so
> > > > > > far found
> > > > > > of that type has only reduced the failure rate, not eliminated
> > > > > > failures. That
> > > > > > could be caused by changing timings without having any impact on the
> > > > > > root cause.
> > > > > >
> > > > > > The key invariant relates to a thread that is doing a scan, starting
> > > > > > with a call
> > > > > > to head() and proceeding through a series of calls to next() to
> > > > > > examine nodes.
> > > > > > The head() call sets up a guard node for the thread that was the
> > > > > > tail at some
> > > > > > point during the head call. The invariant is that the series of
> > > > > > next() calls
> > > > > > will reach the guard node before finding a null next pointer,
> > > > > > indicating the
> > > > > > actual tail.
> > > > > >
> > > > > > The remove call does not really remove anything, it merely marks the
> > > > > > node
> > > > > > removed. Removed nodes are unlinked as a side effect of scans,
> > > > > > during head and
> > > > > > next calls, but only if they are not guard nodes.
> > > > > >
> > > > > > There are additional complications in the restart and reap methods,
> > > > > > but we can
> > > > > > ignore them for now - my test does not use them.
> > > > > >
> > > > > > Once a guard node is lost, the synchronization breaks down
> > > > > > completely, because
> > > > > > e.g. insertion at tail is protected by synchronization on the
> > > > > > FastList instance,
> > > > > > but unlinking of a removed node in the middle is protected, to the
> > > > > > extent it is
> > > > > > protected at all, by synchronization on the FastList.Node instance
> > > > > > that is being
> > > > > > removed.
> > > > > >
> > > > > > The commonest failure symptom is a scan reaching the null next
> > > > > > pointer at the
> > > > > > end of the FIFO during head(), without first finding the guard node
> > > > > > it just set
> > > > > > up. An alternative form of failure is loss of some entries - they
> > > > > > get added, but
> > > > > > the remove threads never find them. The second symptom is
> > > > > > predominant in the
> > > > > > JavaSpaces stress test that got me started on this. Messing up a
> > > > > > next pointer
> > > > > > could cause either.
> > > > > >
> > > > > > Incidentally, I'm curious about why it has such a fragile system in
> > > > > > which the
> > > > > > state of a scan is partly tracked by thread, when it seems like an
> > > > > > obvious
> > > > > > candidate for the Iterator pattern. Callers do need to be able to
> > > > > > find out if a
> > > > > > remove call succeeded or not (the node may have been removed by
> > > > > > another thread),
> > > > > > but that could be done in an interface extending Iterator. The
> > > > > > WeakHashMap in a
> > > > > > node that keeps track of the threads for which it is a guard would
> > > > > > instead track
> > > > > > the Iterator. There would be no need for thread local storage, the
> > > > > > same data
> > > > > > could be kept in the Iterator.
> > > > > >
> > > > > > Thanks for any time you can spend looking at this.
> > > > > >
> > > > > > Patricia
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> >
>

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/13/2010 11:32 PM, MICHAEL MCGRADY wrote:
> Patricia
>
> http://www.cse.lehigh.edu/~spear/
> http://www.cs.rochester.edu/u/vmarathe/
> http://www.cs.rice.edu/~wns1/
> http://www.cs.rochester.edu/u/scott/papers/2003_FTDCS_IW.pdf
>
> A few other sources that are not so antiquated, which is not a negative.  I have asked Prof Scott to share his latest research.
>

Good. I'll read those. Also, I'm trying looking for research on the 
trade-off between blocking and non-blocking.

At the high level, the way I see it is that blocking has the 
disadvantage that preemption of a thread while it holds the lock may 
delay the rest of the threads until the first thread gets back on a 
processor and finishes the critical region. The advantage is that 
processors that are waiting do not use the processor-memory interconnect 
and do not get in the way.

I've seen logic analyzer traces of 64 processors all trying to increment 
the same atomic counter by one. The counter was implemented using 
compare-and-swap. It looked something like this:

64 processors all look at the counter, and see it is zero.
64 processors all try to compare-and-swap change it to one.
1 processor succeeds, and goes on to do something else. The other 63 all 
see the counter has changed, and try again.

63 processors all look at the counter, and see it is one.
...

The reality was not quite that tidy - some processors might not get to 
see the new value of the counter until it has been incremented a couple 
of times - but the point is that contention costs either way.

In something like outrigger, part of the solution may be to work on 
having the optimal number of threads, so that the processors are kept 
busy as long as there is work to do, but the threads are not tripping 
over each other.

Patricia

Re: datastructure classes

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

Patricia

http://www.cse.lehigh.edu/~spear/
http://www.cs.rochester.edu/u/vmarathe/
http://www.cs.rice.edu/~wns1/
http://www.cs.rochester.edu/u/scott/papers/2003_FTDCS_IW.pdf

A few other sources that are not so antiquated, which is not a negative.  I have asked Prof Scott to share his latest research.

MG

On Dec 13, 2010, at 5:15 PM, Patricia Shanahan wrote:

> You make a good point that I should take a look around to see if I can find anything suitable that already exists, with appropriate licensing.
> 
> That said, I believe the requirements for an outrigger FastList are:
> 
> 1. Highly scalable performance. In particular, iterating over the list should be fast even when many threads are doing it. As far as I can tell, any JavaSpace lookup turns into iterating over a FastList representing items of acceptable classes, to see if any of the nodes match the other template requirements.
> 
> 2. A limited set of features: Iterate over the list starting at the head, remove a node, add at tail.
> 
> The two libraries each go in a different direction.
> 
> Javolution is primarily a single thread speed play, with fast and predictable performance. It is what I would want if I were writing real time code. It does have some support for concurrency, but not what is needed for outrigger: "Fast collections (or maps) can be made thread-safe by marking them FastCollection#shared shared Having a shared collection (or map) means that it can be safely used without synchronization. It does not mean that a change made by a thread is automatically viewed by another thread (that would require synchronizing)."
> 
> The Apache Commons collections are very feature rich, but seem to be primarily single thread, with the synchronizing decorator approach to concurrency.
> 
> I will do some more web searching, but my current strongest contender for an existing class is java.util.concurrent.ConcurrentLinkedQueue. It's main limitation is slow remove, but that might be worked around by using a pair of queues, and periodically cleaning by copying from one queue to another. Also, java.util.concurrent contains a reference to an important paper, http://www.cs.rochester.edu/u/michael/PODC96.html, that gives me a starting point for finding the latest research on these issues.
> 
> Patricia
> 
> 
> 
> 
> On 12/13/2010 4:30 PM, James Grahn wrote:
>> Because there've been a few concerns about memory models and concurrent
>> data structures, I thought I'd suggest a couple of resources. Library
>> usage is preferable to reinventing the wheel, where possible:
>> 
>> First:
>> http://javolution.org/
>> 
>> I haven't used it directly; I've only used a library which used it. But
>> it perhaps deserves a scan to see if it would meet our requirements (and
>> testing to ensure it's acceptable if it claims to).
>> 
>> Also, it's BSD-licensed. I believe that's acceptable?
>> 
>> Second:
>> http://commons.apache.org/collections/index.html
>> 
>> I don't *think* it has what we need, but it's worth poking around in
>> before creating a new datastructure. Has the advantage of being part of
>> the Apache family.
>> 
>> jamesG
>> 
>> On 12/13/2010 5:24 PM, Peter Firmstone wrote:
>>> Gregg Wonderly wrote:
>>>> On 12/13/2010 12:34 PM, Gregg Wonderly wrote:
>>>>> This does fail fairly quickly (immediately) on my windows laptop.
>>>>> 
>>>>> I am not sure that I have time to really look over this code. I
>>>>> wonder if anyone
>>>>> knows if this is relatively new code that John put together as part
>>>>> of the
>>>>> effort to remove the use of PSE from outrigger, or is the original
>>>>> "non-persistent" javaspaces implementation?
>>>>> 
>>>>> Perhaps we need to do something different here, a segmented list for
>>>>> example,
>>>>> which is what PSE did with it's Vector implementation so that
>>>>> segments of the
>>>>> list could be locked independently, as well as allowing the segments
>>>>> to be
>>>>> "deleted" from disk once they were "empty".
>>>> 
>>>> And, of course if we pull out outrigger, as an application/service,
>>>> separate from the Jini part of river, we could just say that Outrigger
>>>> requires JDK1.5 so that we can move to a different concurrency
>>>> implementation if that is needed, using the new memory model.
>>> 
>>> It seems like a lot of work to fix a package private implementation,
>>> already based on flawed assumptions (Patricia's done a great job
>>> debugging this, she's a real asset, I look forward to learning more
>>> debugging tips). I suspect we'll get a much better result starting from
>>> scratch, utilising Java 6.
>>> 
>>> Since River is a Jini platform, why don't we start by creating an
>>> independent implementation of outrigger utilising any latest available
>>> java features. Not only will this produce a better implementation, that
>>> will be easier to support, but it might improve our understanding of
>>> what's required for a modular build as well.
>>> 
>>> The existing outrigger implementation can remain as it is, but
>>> deprecated, left in place for legacy support.
>>> 
>>> I've got some concurrent utilities in pepe you may pinch / improve if
>>> you like:
>>> 
>>> org.apache.river.impl.util.*
>>> 
>>> ConcurrentCollections
>>> ConcurrentSoftMap
>>> ConcurrentWeakIdentityMap
>>> ConcurrentWeakMap
>>> 
>>> ConcurrentCollections is a multi read single write lock based collection
>>> wrapper. It defensively copies for Iterators but still allows performing
>>> removals from the underlying collection.
>>> 
>>> The ConcurrentMap's are based on ConcurrentHashMap, using a
>>> ReferenceQueue to remove stale entries prior to every method call.
>>> Cheers,
>>> 
>>> Peter.
>>> 
>>>> 
>>>> One of the things I did in griddle, was define a RemoteIterator which
>>>> allows a "get" operation to have a return value before anything is in
>>>> the space. A server side thread, then looks through the space for
>>>> matches and adds them to the return queue for the calling "get". This
>>>> allows server side contention to be managed to some degree because the
>>>> number of "searching" threads could be held to an appropriate minimum
>>>> (even one). The javaspaces API doesn't disallow such proxy
>>>> implementation and JavaSpaces05 iterator support starts to expose this
>>>> kind of thing more literally.
>>>> 
>>>> Ultimately, I think a segmented list that looks like a
>>>> List<List<Entry>> would be a way to distribute locking for "get"
>>>> because iteration would be less likely to be on the same segment at
>>>> the same time.
>>>> 
>>>> Insertion at the tail, is always a contentious issue for concurrent
>>>> lists. Sometimes you can just use a small ConcurrentHashMap to perform
>>>> adds until it gets to a particular size, and then turn its contents
>>>> into a List and add that list to the tail of the List<List<Entry>>.
>>>> You can choose then to decide when to do that movement by watching for
>>>> the other lists to be empty too, or when a traversal gets to the end
>>>> of what is in a list already.
>>>> 
>>>> This keeps writers quite free to insert quickly and completely away
>>>> from the readers as well. Ordering presents most of the opportunities
>>>> for big time contention. Unordered (map), or more segmented (map of
>>>> list) construction relieves some of the contention if course, by
>>>> distributing it.
>>>> 
>>>> Gregg
>>>> 
>>>>> Gregg
>>>>> 
>>>>> On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
>>>>>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>>>>>> Patricia Shanahan wrote:
>>>>>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>>>>>> ...
>>>>>>>> > The important issue in FastList is that it was written with the
>>>>>>>> JDK1.4
>>>>>>>> > memory model. After moving River to Java 1.5, we'd have the
>>>>>>>> JSR166 work
>>>>>>>> > and the new, consistent memory model where volatile has a true
>>>>>>>> meaning.
>>>>>>>> > However, this code in particular is quite complex as you have
>>>>>>>> noted, so
>>>>>>>> > even adjusting to the new memory model could be problematic.
>>>>>>>> 
>>>>>>>> I've just run a modified, simplified version of my test with java
>>>>>>>> version "1.4.2_19" and an unmodified copy of FastList, and I still
>>>>>>>> get
>>>>>>>> the NullPointerException. This changes my thinking a bit. I had been
>>>>>>>> working from the assumption that the issue was to do with the
>>>>>>>> changes
>>>>>>>> in memory model between 1.4 and 1.5. I now have to consider the
>>>>>>>> possibility of a more basic bug that is independent of the memory
>>>>>>>> model.
>>>>>>>> 
>>>>>>>> If there is anyone with a FastList or Java memory model background
>>>>>>>> who
>>>>>>>> would like to help, please reply. I would welcome another set of
>>>>>>>> eyes
>>>>>>>> on the code, and a cross check on my conclusions so far about how
>>>>>>>> FastList is supposed to work. There seems to be a critical invariant
>>>>>>>> that gets broken, and once that happens we are on track to either a
>>>>>>>> NullPointerException or dropped items.
>>>>>>>> 
>>>>>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main
>>>>>>>> program (JDK 1.4 or later0. In both forms, all it does is fire up a
>>>>>>>> mixture of threads that repeatedly add items to a FastList and
>>>>>>>> threads
>>>>>>>> that repeatedly remove the first item they can from the FastList.
>>>>>>>> Failures seem to require simultaneous adds and removes.
>>>>>>>> 
>>>>>>>> If I don't nail this problem fairly soon, I may abandon the current,
>>>>>>>> rather complicated, code and switch to writing a concurrent high
>>>>>>>> performance FastList substitute for 1.5 or later.
>>>>>>>> 
>>>>>>>> Patricia
>>>>>>>> 
>>>>>>> I'll have a look tonight, no promises though ;)
>>>>>> 
>>>>>> I'm attaching the simplified test application main program that can
>>>>>> run, and
>>>>>> fail, under JRE 1.4, with no need for Junit.
>>>>>> 
>>>>>> Here's what I think I know. First of all, I have found some dubious
>>>>>> synchronization situations. However, fixing all the things I have so
>>>>>> far found
>>>>>> of that type has only reduced the failure rate, not eliminated
>>>>>> failures. That
>>>>>> could be caused by changing timings without having any impact on the
>>>>>> root cause.
>>>>>> 
>>>>>> The key invariant relates to a thread that is doing a scan, starting
>>>>>> with a call
>>>>>> to head() and proceeding through a series of calls to next() to
>>>>>> examine nodes.
>>>>>> The head() call sets up a guard node for the thread that was the
>>>>>> tail at some
>>>>>> point during the head call. The invariant is that the series of
>>>>>> next() calls
>>>>>> will reach the guard node before finding a null next pointer,
>>>>>> indicating the
>>>>>> actual tail.
>>>>>> 
>>>>>> The remove call does not really remove anything, it merely marks the
>>>>>> node
>>>>>> removed. Removed nodes are unlinked as a side effect of scans,
>>>>>> during head and
>>>>>> next calls, but only if they are not guard nodes.
>>>>>> 
>>>>>> There are additional complications in the restart and reap methods,
>>>>>> but we can
>>>>>> ignore them for now - my test does not use them.
>>>>>> 
>>>>>> Once a guard node is lost, the synchronization breaks down
>>>>>> completely, because
>>>>>> e.g. insertion at tail is protected by synchronization on the
>>>>>> FastList instance,
>>>>>> but unlinking of a removed node in the middle is protected, to the
>>>>>> extent it is
>>>>>> protected at all, by synchronization on the FastList.Node instance
>>>>>> that is being
>>>>>> removed.
>>>>>> 
>>>>>> The commonest failure symptom is a scan reaching the null next
>>>>>> pointer at the
>>>>>> end of the FIFO during head(), without first finding the guard node
>>>>>> it just set
>>>>>> up. An alternative form of failure is loss of some entries - they
>>>>>> get added, but
>>>>>> the remove threads never find them. The second symptom is
>>>>>> predominant in the
>>>>>> JavaSpaces stress test that got me started on this. Messing up a
>>>>>> next pointer
>>>>>> could cause either.
>>>>>> 
>>>>>> Incidentally, I'm curious about why it has such a fragile system in
>>>>>> which the
>>>>>> state of a scan is partly tracked by thread, when it seems like an
>>>>>> obvious
>>>>>> candidate for the Iterator pattern. Callers do need to be able to
>>>>>> find out if a
>>>>>> remove call succeeded or not (the node may have been removed by
>>>>>> another thread),
>>>>>> but that could be done in an interface extending Iterator. The
>>>>>> WeakHashMap in a
>>>>>> node that keeps track of the threads for which it is a guard would
>>>>>> instead track
>>>>>> the Iterator. There would be no need for thread local storage, the
>>>>>> same data
>>>>>> could be kept in the Iterator.
>>>>>> 
>>>>>> Thanks for any time you can spend looking at this.
>>>>>> 
>>>>>> Patricia
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
> 

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: datastructure classes

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

Sim,

It is not any of these things.  I think that both an in-memory system and a db system are options and important to provision, and a lot of other alternatives as well.  I do not know how you get to the touching Entry conjecture.  

We were talking about increased performance and about quality control criteria implicitly.  

I decided to put my two cents in and share what I knew from working with systems with very high demands: that is all there is to it, essentially.  My two cents involves my central expertise, architecting pieces of high performance, mission critical, systems.  The whole system needs to be looked at from my perspective prior to deciding on implementation details, otherwise an accidental architecture is the result.  We know how those work.  I am just sharing what I think and what the criteria (performance) is I need for our clients.  

Lately I am just talking about what I talked about!  ;-)  

I am not sure what you are thinking, Sim, but I hope I addressed what you are concerned about.

MG

On Dec 16, 2010, at 1:44 PM, Sim IJskes - QCG wrote:

> On 12/16/2010 09:48 PM, MICHAEL MCGRADY wrote:
>> mean we do not do databases.  Of course we do.  Doesn't everyone?
>> However, the primary data model and structures have to be in-memory
>> because we cannot tolerate the time database calls take (in-memory is
>> approximately 10,000 times faster).  I think that this is not only a
> 
> Is it, that you assume, that we would convert a in-memory based system into a db-only system? Would a persistence interface hook, add so much processing, that you can say in advance that you will not use it?
> 
> Is it, that you say, don't touch the implementation of Entry, or i will not use it?
> 
> Gr. Sim

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: datastructure classes

Posted by Sim IJskes - QCG <si...@qcg.nl>.

On 12/16/2010 09:48 PM, MICHAEL MCGRADY wrote:
> mean we do not do databases.  Of course we do.  Doesn't everyone?
> However, the primary data model and structures have to be in-memory
> because we cannot tolerate the time database calls take (in-memory is
> approximately 10,000 times faster).  I think that this is not only a

Is it, that you assume, that we would convert a in-memory based system 
into a db-only system? Would a persistence interface hook, add so much 
processing, that you can say in advance that you will not use it?

Is it, that you say, don't touch the implementation of Entry, or i will 
not use it?

Gr. Sim

Re: datastructure classes

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

If the system was linearly scalable and the result was fast enough, no.  For me, philosophy is dead and engineering lives.  I don't care about definitions and such, I care about results and that is what you mean.  I agree wholeheartedly.

MG

On Dec 16, 2010, at 7:58 PM, Patricia Shanahan wrote:

> On 12/16/2010 12:48 PM, MICHAEL MCGRADY wrote:
> ...
>> So, I think the non-functional requirements and related technologies,
>> e.g., clustering, in-memory access, etc., are primary.  For example,
>> when scaling is at issue, it is not important that 300,000,000
>> transactions can be handled in 10 hours, say, but the fact that we
>> can start at 100 transactions with economies of scale and then use
>> the system to scale to 300,000,000 with the same performance is.
> 
> Given a system that does 300,000,000 transactions with acceptable response time, would you really object if the 100 transaction system had even faster response time?
> 
> Patricia

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/16/2010 12:48 PM, MICHAEL MCGRADY wrote:
...
> So, I think the non-functional requirements and related technologies,
> e.g., clustering, in-memory access, etc., are primary.  For example,
> when scaling is at issue, it is not important that 300,000,000
> transactions can be handled in 10 hours, say, but the fact that we
> can start at 100 transactions with economies of scale and then use
> the system to scale to 300,000,000 with the same performance is.

Given a system that does 300,000,000 transactions with acceptable 
response time, would you really object if the 100 transaction system had 
even faster response time?

Patricia

Re: datastructure classes

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

Just a few thoughts below, which are advanced to add to the conversation, not detract from it.

On Dec 16, 2010, at 8:01 AM, Patricia Shanahan wrote:

> Mike McGrady wrote:
>> Intense network systems like our clients cannot work with database
>> calls.  They are too slow and do not scale.  They either reach a cost
>> or a performance ceiling.
> 
> I feel we are mixing requirements and implementation.

That may be correct, Patricia.  And the distinction is important.

On requirements, they can be seen as non-functional and functional.  The former drive the latter, I think we will all agree, but maybe not.  I definitely do not think that we can see if something will work and not consider initially whether or not even if it works it would be fairly useless except in ho-hum cases.  To ho-hum business users, that may be taken as a slight.  It is not.  I am just interested in because my clients are interested in high performance.

So, I think the non-functional requirements and related technologies, e.g., clustering, in-memory access, etc., are primary.  For example, when scaling is at issue, it is not important that 300,000,000 transactions can be handled in 10 hours, say, but the fact that we can start at 100 transactions with economies of scale and then use the system to scale to 300,000,000 with the same performance is.  This is particularly important in a cloud-based economy and using economies of scale in equipment, development time, etc.  That said, let me say a few things and I offer them not as written in stone but as something to think about and maybe agree on, or disagree on.

First, I should qualify this.  I am speaking off the top of my head and it appears I must be more guarded and/or more considerate of the time people have to read.  I am a toss it out and discuss it guy and some people do not have the time for that.  I believe in measure ten times and cut once, but people have different drivers.  I did not mean we do not do databases.  Of course we do.  Doesn't everyone?  However, the primary data model and structures have to be in-memory because we cannot tolerate the time database calls take (in-memory is approximately 10,000 times faster).  I think that this is not only a requirement for me and is not sui generis but is really a dominant part of the industry that will consider using Outrigger.  Also, unless we want to give up and succumb to Brewer's Theorem, a multi-tiered architecture with asynchronous writes to a database with eventual consistency will not do. This does not mean that Outrigger must not write to a database.  It must, of course, but there are other considerations, as we all know.

Second, that said, scaling as a requirement is roughly the capacity to continue to add stressors on the system (users, connections, memory, cpu cycles) and have the performance be the same, or equal.  If the performance drops, that does not scale, in this definition, which is the one I am used to.  What scaling does not mean to me is the ability to handle a given humongous load, humongous given stressors.

Third, I am not sure at this point where we want to go with Outrigger.  I am only interested and only have time to build a system that will not knock on the door but rather will knock the door down.  My clients expect no less.  And, ultimately I am not here for academic interests.  Others might have different interests.  I respect that of course.  I am just stating what I need.

Fourth, I tend to work as primarily an architect and a designer of system-of-systems off the non-functional values, the QCC or the "ilities".  This is, for my perspective, the goal and functional requirements in my mind are defined by and flow from these non-functional requirements.  So, I might have a different perspective but ultimately we would agree, I think, on the details.

> Response time, throughput, ACID properties and the like are requirements. What Outrigger uses as persistent storage for tuple space is implementation.
> 
> I would like to get more data on your performance requirements - specifics about e.g. x% of JavaSpace reads under these conditions must complete within y seconds.

I can answer this in a few days but it really would be as fast as possible until other values are compromised.  Certainly I can say that I work in complicated and critical systems and have to use as little time in my piece as possible.  When a radar has to be read, sign on security passed, multilevel security ensured, upgraded or downgraded structural data handled, integration logic decoupled, whether at the application or virtual machine basis, etc., and get to a screen in 1.0 seconds, the more invisible we are the better, except in some respects.  ;-)

> 
> The most useful and specific way of presenting requirements would be a scalable benchmark - something where we can simulate a larger number of machines doing real work by having a smaller number dedicated to driving transactions at a single JavaSpace and measuring the results.

I do not know if you are covering this in what you say, you might be and you probably are, but do you cover the case where the replication of the data is transactional?  Also, could you expand on this.  I think you have a great idea here but I am not sure that you have expressed the whole of it.

> 
> Measurements from a small to medium sized environment would be useful input for estimating performance at higher loads. Without that sort of data, I cannot even guess whether an achievable Outrigger implementation will be able to meet all your requirements.

I know that either an asynchronous write (consistency) or a synchronous write (performance) to a database will not work if that is the end-all and be-all for a number of reasons..  If you want me to quantify that, I can, but the fact that we can spend millions and do it will not be adequate.  The cost of machines and development time is important, crucial.  Scaling is necessary to achieve a consistent cost between high and low levels of stressors to a system, especially in this cloud-based economies of scale age.

> 
> Patricia

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

I have no problem with this as a base for discussing the long term issues.

In the short term, there is another Jira I need to write - reporting a 
bug in Outrigger. I got into this because of an intermittent QA test 
failure on Hudson. That needs to be fixed, either by finding out what is 
specifically wrong in the existing FastList implementation, other than 
being too complicated, or writing a new, simpler, more maintainable 
FastList.

Patricia


Tom Hobbs wrote:
> I've taken the liberty of creating a new Jira for the refactoring of
> FastList in the Outrigger internals.
> 
> https://issues.apache.org/jira/browse/RIVER-386
> 
> Sorry is that's presumptuous or if the wording of ticket is wrong.  I
> thought it'd be good to start getting something a bit more structured
> down.  Particularly if we can get the details of the kinds of numbers
> that Mike is talking about.
> 
> Personally, I think that Mike's (and others) comments are exactly what
> this dev community needs to see.  Any implementation decision we make
> (particular when refactoring existing functionality) needs to have (at
> least) an eye on the needs of our users.
> 
> On Thu, Dec 16, 2010 at 4:01 PM, Patricia Shanahan <pa...@acm.org> wrote:
>> Mike McGrady wrote:
>>> Intense network systems like our clients cannot work with database
>>> calls.  They are too slow and do not scale.  They either reach a cost
>>> or a performance ceiling.
>> I feel we are mixing requirements and implementation. Response time,
>> throughput, ACID properties and the like are requirements. What Outrigger
>> uses as persistent storage for tuple space is implementation.
>>
>> I would like to get more data on your performance requirements - specifics
>> about e.g. x% of JavaSpace reads under these conditions must complete within
>> y seconds.
>>
>> The most useful and specific way of presenting requirements would be a
>> scalable benchmark - something where we can simulate a larger number of
>> machines doing real work by having a smaller number dedicated to driving
>> transactions at a single JavaSpace and measuring the results.
>>
>> Measurements from a small to medium sized environment would be useful input
>> for estimating performance at higher loads. Without that sort of data, I
>> cannot even guess whether an achievable Outrigger implementation will be
>> able to meet all your requirements.
>>
>> Patricia
>>
>

Re: datastructure classes

Posted by Tom Hobbs <tv...@googlemail.com>.

I've taken the liberty of creating a new Jira for the refactoring of
FastList in the Outrigger internals.

https://issues.apache.org/jira/browse/RIVER-386

Sorry is that's presumptuous or if the wording of ticket is wrong.  I
thought it'd be good to start getting something a bit more structured
down.  Particularly if we can get the details of the kinds of numbers
that Mike is talking about.

Personally, I think that Mike's (and others) comments are exactly what
this dev community needs to see.  Any implementation decision we make
(particular when refactoring existing functionality) needs to have (at
least) an eye on the needs of our users.

On Thu, Dec 16, 2010 at 4:01 PM, Patricia Shanahan <pa...@acm.org> wrote:
> Mike McGrady wrote:
>>
>> Intense network systems like our clients cannot work with database
>> calls.  They are too slow and do not scale.  They either reach a cost
>> or a performance ceiling.
>
> I feel we are mixing requirements and implementation. Response time,
> throughput, ACID properties and the like are requirements. What Outrigger
> uses as persistent storage for tuple space is implementation.
>
> I would like to get more data on your performance requirements - specifics
> about e.g. x% of JavaSpace reads under these conditions must complete within
> y seconds.
>
> The most useful and specific way of presenting requirements would be a
> scalable benchmark - something where we can simulate a larger number of
> machines doing real work by having a smaller number dedicated to driving
> transactions at a single JavaSpace and measuring the results.
>
> Measurements from a small to medium sized environment would be useful input
> for estimating performance at higher loads. Without that sort of data, I
> cannot even guess whether an achievable Outrigger implementation will be
> able to meet all your requirements.
>
> Patricia
>

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

Mike McGrady wrote:
> Intense network systems like our clients cannot work with database
> calls.  They are too slow and do not scale.  They either reach a cost
> or a performance ceiling.

I feel we are mixing requirements and implementation. Response time, 
throughput, ACID properties and the like are requirements. What 
Outrigger uses as persistent storage for tuple space is implementation.

I would like to get more data on your performance requirements - 
specifics about e.g. x% of JavaSpace reads under these conditions must 
complete within y seconds.

The most useful and specific way of presenting requirements would be a 
scalable benchmark - something where we can simulate a larger number of 
machines doing real work by having a smaller number dedicated to driving 
transactions at a single JavaSpace and measuring the results.

Measurements from a small to medium sized environment would be useful 
input for estimating performance at higher loads. Without that sort of 
data, I cannot even guess whether an achievable Outrigger implementation 
will be able to meet all your requirements.

Patricia

Re: datastructure classes

Posted by Gregg Wonderly <gr...@wonderly.org>.

On 12/16/2010 3:02 PM, Wade Chandler wrote:
> Now, the one place that doesn't seem a waste of time is using those unmarshalled
> Entries and fields for lookup. Seems having those realized in memory already is
> much faster. A way to speed up the overall effort would be to store the
> serialized form along with the realized/unmarshalled form in the list or what
> ever data structure since these held instances are immutable per the
> write/read/take operations; no reserialization. Then the serialized form can be
> returned immediately. More memory of course, but that could be limited with a
> logical archiving mechanism for memory which hasn't been accessed for a period
> of time (paging essentially of the serialized form). A hash, depending on size,
> would of course work for that whether on disk or in memory.
>
> Dwelling further, it doesn't seem probable to just take the Entry fields and
> serialize them associate them with their hash (hashCode) and then use something
> to compare those things because there is a possibility that hashCode hasn't been
> overridden and thus equals and hashCode won't necessarily both represent
> equality with the same meaning. Were there some guarantee that is the case, then
> that would be possible to do.

This is the type of thought that I gave when deciding to split keys and values 
in griddle, and to wrap them into an object which has a "matches" implementation 
along with equals and hashCode.  This allows "canned" matching to be in the 
classpath of the server, so no "downloaded code" for that, as well as allow 
unmarshalling of keys, which are intended to only ever be "native" types, or 
types already in the classpath of the server.

Gregg Wonderly

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/17/2010 10:22 AM, Gregg Wonderly wrote:
> On 12/16/2010 9:35 PM, Patricia Shanahan wrote:
>> I would love to be able to base it on a ConcurrentMap, but I don't see
>> how it
>> would work.
>>
>> As far as I can tell, there is no clean split between key and non-key
>> among the
>> entry's public fields. A template can specify values for any subset of
>> the
>> fields, and different templates used in reading from the same space
>> may choose
>> different sets of fields.
>>
>> Any ideas how to work around that?
>
> One of the things I did in Griddle was to create a CHM for each key
> field name. I then put every entry into the appropriate map based on it
> having a key value pair with a name.
>
> Matching is then a matter of querying each map with the provided key's
> value, and only if the same entry exists in all tables does a match
> begin processing.
...

Yes, I see what you mean. I've used similar techniques with 
java.util.HashMap to deal with multiple entries for the same key.

It raises a question about Outrigger functionality. As currently 
implemented, it is FIFO. For example, a single Entry read gets the 
oldest Entry in the space that matches the template.

The net.jini.space.JavaSpace interface does not require FIFO behavior. 
It says things like "Read any matching entry from the space, blocking 
until one exists."

In  a CHM-based implementation keeping the FIFO behavior might involve a 
sort taking O(k log k) time, where k is the number of elements that match.

Or k may be the number of entries in the smallest contents for any match 
field. One possible sequence is find the matching entries for the most 
restrictive field, sort them in time order, and search the sorted result 
for the first Entry that meets the rest of the template rules.

Any thoughts on this.

Patricia

Re: datastructure classes

Posted by Gregg Wonderly <gr...@wonderly.org>.

On 12/16/2010 9:35 PM, Patricia Shanahan wrote:
> I would love to be able to base it on a ConcurrentMap, but I don't see how it
> would work.
>
> As far as I can tell, there is no clean split between key and non-key among the
> entry's public fields. A template can specify values for any subset of the
> fields, and different templates used in reading from the same space may choose
> different sets of fields.
>
> Any ideas how to work around that?

One of the things I did in Griddle was to create a CHM for each key field name. 
  I then put every entry into the appropriate map based on it having a key value 
pair with a name.

Matching is then a matter of querying each map with the provided key's value, 
and only if the same entry exists in all tables does a match begin processing.

key->map->put(entry,entry).

Thus an entry is keyed based on its uniqueness as an object.  There is not, 
currently, a list of objects, but instead a map so that remove is easy, and does 
not, in general, interfere with reads.

This, ultimately, could result in items not being processed in a high bandwidth 
environment due to how iteratation might work out.  So, I need to do something 
different like use a skip-list or some other queue like mechanism which will 
make sure entries eventually are processed.  A subtle aside, is that since the 
objects are the keys themselves, that to some degree there is an eventual 
completion possible because at some point the "hash" will be ordered to expose 
all entries.

> We can't just use a separate ConcurrentMap for each field, and take the
> intersection of results, because a map requires unique keys. A JavaSpace may
> have many entries with the same value of a given field.

The map described above is about compartmentalization of the entries into "sets" 
that would be most likely to then be matched.  In Outrigger, this might get you 
to a FastList of entries that all expose a value for the same key.  You would 
then need to scan them for the matching value.  From a scale perspective, the 
number of "in transit" objects impacts linear scanning ETA the most.  Heavily 
loaded spaces (ones with slower transaction/transition speeds) will have more 
latency added to the overall throughput of a single operation.

Javaspaces doesn't unmarshal entry data to avoid code downloading.  It demands 
keys be Object types so that a MarshalledInstance can be created in the client 
when the EntryRep is created.  Then, the comparison is about the Entry 
serializing to the same form as expressed in the EntryRep internally.

EntryRep's MarshalledInstance would be exactly the right key to use for further 
compartmentalization to, for example put a list of EntryRep objects with exactly 
the same value.  The simple side issue is that EntryRep doesn't store field 
names.  It only stores values in a array which is ordered based on the order of 
the fields returned from reflection.  So some work would be needed to actually 
have the name of the field in the server.

Let me talk a little bit more about Griddle's TemplateEntry interface shown here:

public interface TemplateEntry<K extends Serializable> extends Serializable {
	// Called with the entry to see if you match.  The called entry will
	// in general be a 'wider' view of the match space than the passed
	// entry.  I.e. ent may have getKeys() return 10 keys, while the
	// called entry may be looking for matches with 3 keys.
	public boolean matches( TemplateEntry<K> ent );
	public List<? extends K> getKeys();
}

This is everything about what Griddle uses for storing and matching.  It leaves 
it up to the implementation to manage the marshal/unmarshal of the entry's 
value.  The matches() method is what is used to see if an entry matches another. 
  When you create a query, you need to decide what getKeys() returns because 
that controls which entries matches() is called with.

So, for example, if you have a basic value object already in your codebase, 
you'd tack on behavior to be a TemplateEntry by adding an interface, but with 
more knowledge of the implementation you'd perhaps not need the interface.

public interface KeyProvider<K,T> {
	public List<K>allKeyNames();
	public Map<K,T> valueMap();
}

public class MyBackingObject implements KeyProvider<String,Object> {
	// This is the object that you want to fly around between clients using
	// the space (griddle)
}

Implementation of the Interface would expose the things that you'd want to be 
considered for matching in the space (griddle).

public class MyTemplateEntry<T extends MyBackingObject>
		 implements TemplateEntry<String> {

	// we marshal the value to carry it to the space.
	private MarshalledObject<T> val;
	// we provide a set of String based key names
	private List<String> keys;
	// We provide the values for those keys as
	// Serializable values.  You do not want to
	// use "random" types for values, only something
	// that the server will have no problem having a
	// class definition for without downloading code, unless
	// that is what you need, and then you just have to
	// think about the impact that has on the server's
	// lifecycle if you need to change the definition
	// of that class.
	private Map<String,Serializable> keyVals;

	public MyTemplateEntry( T value ) {
		val = new MarshalledObject<T>( value );
		// these are separated so that null values
		// can be part of what matches.  Otherwise
		// keys could just be valueMap().keySet().
		keys = value.allKeyNames();
		keyVals = value.valueMap();
	}

	public boolean matches( TemplateEntry<String> ent ) {
		// A more concrete type would not require this
		// check.
		if( ent instanceof MyTemplateEntry == false )
			return false;
		MyTemplateEntry<T extends MyBackingObject> mte =
			(MyTemplateEntry<T extends MyBackingObject>)ent;
		// get the keys for the entry that we are going to see
		// if we match.
		List<String>tkeys = ent.getKeys();
		for( String k : keys ) {
			// we need to find that the passed entry provides
			// a value for the same key.
			boolean found = false;
			for( String tk : tkeys ) {
				if( k.equals(tk) ) {
					// we want to check this value.
					// (need better logic here if we
					// want null == null to work.
					if( keyVals.get(k).equals(
						mte.keyVals.get(k) ) {
						found = true;
					}
				}
			}
			if( !found ) {
				// this entry does not provide a matching
				// value for the key we have.
				return false;
			}
		}
	}
	return true;
}

With all of that in place, a query might then either be a subclass, or it might 
just need to be created using a separate constructor of the base entry class.

public class OwnerMyTemplateEntryQuery extends
		 MyTemplateEntry<KeyProvider<String,Object>> {
	public List<String> getKeys() {
		return Arrays.asList( "first", "last" );
	}
}

The end result I was aiming for, was unlimited matching logic, 
minimization/elimination of downloaded code being possible, and separation of 
keys and the "value" so that anything serializable could be carried around as 
the value, and key/value pairs could be derived or otherwise managed separately.

With outrigger, I think that some of the "not visible" data could be more 
readily included in the EntryRep to make some of the activities of the 
storage/matching easier to do while eliminating some contention by further 
compartmentalization of the data.

Gregg Wonderly

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/16/2010 1:15 PM, Jeff Ramsdale wrote:
> In response to Wade's HashMap comments...
>
> I've wondered for a while if a JavaSpaces implementation could be
> built on ConcurrentMap. If so, and if the implementation were
> pluggable at runtime, one could provide Infinispan
> (http://www.jboss.org/infinispan/) caches (which implement
> ConcurrentMap) to serve as the store for the JavaSpace. Persistence,
> and/or clustering could then be left to Infinispan to handle.
>
> There could be any number of pitfalls here, but I've wondered if it
> might provide a flexible and powerful platform for scaling Outrigger.

I would love to be able to base it on a ConcurrentMap, but I don't see 
how it would work.

As far as I can tell, there is no clean split between key and non-key 
among the entry's public fields. A template can specify values for any 
subset of the fields, and different templates used in reading from the 
same space may choose different sets of fields.

Any ideas how to work around that?

We can't just use a separate ConcurrentMap for each field, and take the 
intersection of results, because a map requires unique keys. A JavaSpace 
may have many entries with the same value of a given field.

Patricia

Re: datastructure classes

Posted by Jeff Ramsdale <je...@gmail.com>.

In response to Wade's HashMap comments...

I've wondered for a while if a JavaSpaces implementation could be
built on ConcurrentMap. If so, and if the implementation were
pluggable at runtime, one could provide Infinispan
(http://www.jboss.org/infinispan/) caches (which implement
ConcurrentMap) to serve as the store for the JavaSpace. Persistence,
and/or clustering could then be left to Infinispan to handle.

There could be any number of pitfalls here, but I've wondered if it
might provide a flexible and powerful platform for scaling Outrigger.

-jeff

Re: datastructure classes

Posted by Wade Chandler <hw...@yahoo.com>.

Yes, unless an in-memory and local DB, then you have another layer of network 
abstraction plus the conversion factor on lookup unless talking about some kind 
of an OO DB. But, when needing to scale to a number of instances, then it may 
very well be unavoidable to go to disk. Also, without such a scheme (some form 
of persistence) there is no way to have long term fail safe persistence which I 
believe is a general requirement for a distributed system, and one which it 
seems JavaSpaces perhaps implies since it uses transactions. What happens to the 
distributed system when the shared memory goes up and down even after a 
transaction has completed successfully and not take operation has occurred?

part thinking out loud, and part solution...perhaps anyways

I'm thinking if some things are done a bit differently, then perhaps certain 
patterns do lend to a DB perhaps or better to a hashmap system. Though I think 
the issue with the DB, extra network, etc, are still an issue which a local file 
or memory system works around.

Thinking about the conversion a bit, and considering JavaSpaces implementations 
do not run or access logic in the instances they are housing, it really seems a 
waste of time to serialize to the Java serialized format and then instantiate 
those same instances to place them into some list, hash, other, etc. The type of 
entry and its fields are what are used to find something and per equality.

Some in memory implementation, in-memory from the client perspective where the 
whole of JavaSpaces is never transmitted over the air or wire, could just skip 
serialization all the way around perhaps ...except for the issue with it being a 
copy... Serialization is probably always a good idea in that case; of course 
depending on the requirements, but for some default implementation for sure.

Now, the one place that doesn't seem a waste of time is using those unmarshalled 
Entries and fields for lookup. Seems having those realized in memory already is 
much faster. A way to speed up the overall effort would be to store the 
serialized form along with the realized/unmarshalled form in the list or what 
ever data structure since these held instances are immutable per the 
write/read/take operations; no reserialization. Then the serialized form can be 
returned immediately. More memory of course, but that could be limited with a 
logical archiving mechanism for memory which hasn't been accessed for a period 
of time (paging essentially of the serialized form). A hash, depending on size, 
would of course work for that whether on disk or in memory.

Dwelling further, it doesn't seem probable to just take the Entry fields and 
serialize them associate them with their hash (hashCode) and then use something 
to compare those things because there is a possibility that hashCode hasn't been 
overridden and thus equals and hashCode won't necessarily both represent 
equality with the same meaning. Were there some guarantee that is the case, then 
that would be possible to do. 

However, and contrary to what I just wrote, since here I'm not talking about 
Object.hashCode and equals, it may be enough to just serialize each field of an 
Entry and take the whole (serialized field that is) and store it as a hash for 
comparison while the serialized form of the overall Entry can be stored for 
reads/takes; no reserialization needed this way when sending an object back to a 
caller. Perhaps those serialized fields can be taken from the Java serialization 
format directly. The work flow of a distributed spaces implementation may have 
the following work flow:

write:
1) A JavaSpaces service is looked up.
2) An Entry is written to the space.
3) The client impl serializes the object with regular Java serialization and 
sends it over the wire/air
4) The server impl takes the serialized instance and scans it; while scanning 
the fields of the Entry (those matching spec requirements of course) and while 
streaming them out is generating a hash or checksum for them. In other words, 
checks the format while at the same time generating hashes which can be used to 
check for equality since the specification states only equality will be used for 
lookup.
5) The serialized form is stored in memory or to disk along with information 
about the types this class extends or implements
6) The field names and hashes are stored and reference some key such as the hash 
for the overall serialized form which is used in a hash map for memory or a 
filename if on disk. Equal hashes should be OK if a reference counter is used to 
save memory and keep from storing the exact same Entries in more than one file 
if write is used more than once.
7) The serialized form is stored along with information about the types this 
class extends or implements

read/take (will leave out removal part of take here...simplest part):
1) A JavaSpaces service is looked up.
2) An Entry is requested with a template
3) The client impl serializes the template and sends it
4) The server impl performs roughly the same operations as 4 of course ignoring 
the null fields of the Entry (wild cards)
5) The generated hashes are used to look at only the required types and fields
6) The first match is returned

That seems like a good base of something which can start to incorporate 
clustering for the contained data as well as support disk or memory based 
contents. A single data structure would not support everything needed to be 
accomplished perhaps, but locking could be reduced to a couple locations I 
believe if memory is segmented into multiple containers. For disk, a structure 
versus a single file could work well.

For instance, the above lends itself to hashmaps. HashMaps can hold their 
contents in memory or in regular random access files since we are just talking 
buckets. The files can hold the buckets where the keys and base file names can 
be held. The key is to having segments and those being of a good size to not by 
default consume large amounts of memory yet provide as few steps as possible for 
a system which is some where in between. By using a hash, versus just an array 
or list, we can move around information per its key or hash, and this same key 
can be placed into any given segment and still function just the same if the 
size of the hashmaps are controlled and set to always be the same size. Too, 
locking can then be done away with on the hashmap segments all together since we 
won't actually resize and the hash and buckets don't need to be recalculated.

Now, there may be times when memory is needed to be regained from unused maps, 
so some locking would be needed for that, but the time to lookup a value would 
be drastically reduced, and for the most part less the field comparisons a 
lookup would take 1n where n is the number of hashmaps which would be 
considerably smaller than where n is the number of Entries as we have now. Too, 
where we're generating hashes while we're scanning the serialized Java class 
that should help in the cost savings even though we have added a new process. 
Luckily hashes are additive and can be made part of the process overall. If 
during the scanning and hash generation there are any errors with the serialized 
format, an error can be thrown as before where unmarshalling and object would 
have taken place.

Clustering, using something like Shoal, could then be possible with only the 
need to send the hashes, field names, types, and serialized format or the pieces 
required to do the job. This way each node in the cluster doesn't have to 
perform the heavier operations and instead just store them using the hash 
mechanisms. At least in a replicating cluster that is.

Anyways, this email is way too long as it is, but that seems like a good start. 

The other parts to figure out are the exact places locking would need to take 
place, how exactly the field references would be structured and stored, how much 
memory that then entails, but something like class_field_hashofvalue (this 
becomes a hashmap key) and the value pointed to is the hash used in the other 
hashmap for actual object lookup. That final hash can be used to pull out the 
serialized form from disk or directly from an in memory hashmap; too that hash 
is a hash of the overall serialized format and can be used to determine if the 
final value really needs to be stored again or not. With that there are some 
counters which need to be incorporated into some formal design.

So that's part of my idea. I'm still dwelling on the overall data structure and 
is why I haven't really commented more yet.

Wade

 ==================
Wade Chandler
Software Engineer and Developer
NetBeans Dream Team Member and Contributor

http://wiki.netbeans.org/wiki/view/NetBeansDreamTeam
http://www.netbeans.org

----- Original Message ----
> From: Mike McGrady <mm...@topiatechnology.com>
> To: "river-dev@incubator.apache.org" <ri...@incubator.apache.org>
> Cc: "river-dev@incubator.apache.org" <ri...@incubator.apache.org>
> Sent: Thu, December 16, 2010 10:04:55 AM
> Subject: Re: datastructure classes
> 
> Intense network systems like our clients cannot work with database calls.   
>They are too slow and do not scale.  They either reach a cost or a  performance 
>ceiling.
> 
> Sent from my iPhone
> 
> Michael  McGrady
> Principal investigator AF081_028 SBIR
> Chief Architect
> Topia  Technology, Inc
> Work 1.253.572.9712
> Cel 1.253.720.3365
> 
> On Dec 16,  2010, at 5:55 AM, Patricia Shanahan <pa...@acm.org> wrote:
> 
> > MICHAEL  MCGRADY wrote:
> >> Patricia,
> >> If you don't mind, I am going  to argue for sticking to the machines,
> >> but answer your question  roughly at the end.
> > 
> > I don't mind machines, as long as they get  somewhat quantified. It was
> > "machines a minute" that really gave me  trouble. However, there are
> > machines and machines, so I would rather  think in terms of transaction
> > rates.
> > 
> >> The questions  about either memory use or connections or transactions
> >> are stressors  leading to the question and the urgency whether or not
> >> something  must scale, but they have nothing or very little to do with
> >> whether  it will scale.
> >> If we try to involve multiple machines, then we will  discover
> >> frequently that, even if we do not need to scale, we cannot  because,
> >> for example, of Amdahl's Theorem.  If we cannot go to  multiple
> >> machines as a practical matter, we cannot scale no matter  what
> >> (ceteris paribus).  If we can, then we will find, it is  our
> >> experience, that we can scale no matter what (ceteris  paribus).
> > 
> > Scaling is exactly the reason why I think the current  FastList-based
> > design is seriously limited, and may need to be replaced  by something
> > that indexes its data.
> > 
> > Outrigger turns a  JavaSpace read into a linked
> > list scan of all the entries in the space  for the required type. Not too
> > bad if the list rarely changes and is  always small enough to fit in
> > cache. If it is big or changing, that is  going to cause a lot of memory
> > traffic.
> > 
> > As implemented,  it cannot be spread across machines very
> > effectively because the  additional scanning due to one machine not knowing 
>that another has already  found the entry would use up the added scan power.
> > 
> > I think to  scale it is going to need some sort of indexing, and as I
> > think about  indexing for the sort of ad-hoc query that JavaSpaces face,
> > combined  with frequent change, it starts looking very like maintaining
> > an indexed  database table.
> > 
> >> However, we should be able to do, say,  hundreds of millions of
> >> transactions in a day in real-time critical  systems such as the FAA
> >> or the stock market with data affinity and  integrity and all the
> >> other "ilities".  If Outrigger cannot do  this, it is of no interest
> >> to us.
> > 
> > The current  record for a relational database doing simple transactions
> > is 30 million  transactions per minute (Oracle/SPARC TPC-C). Your mileage
> > may vary, but  there is no inherent limit on relational database scaling
> > that puts a  few hundred thousand transactions per minute out of reach.
> > 
> > 
> >> MG
> >> On Dec 14, 2010, at 9:48 PM, Patricia Shanahan  wrote:
> >>> Please clarify "machines a minute". Can you express it  e.g. in
> >>> transactions per minute?
> >>>  Patricia
> >>> Mike McGrady wrote:
> >>>> Linear - 7000  machines a minute, or more. Sent from my iPhone Michael 
>McGrady Principal  investigator AF081_028 SBIR Chief
> >>>> Architect Topia  Technology, Inc Work 1.253.572.9712 Cel
> >>>> 1.253.720.3365 On  Dec 14, 2010, at 2:04 PM, Patricia Shanahan
> >>>> <pa...@acm.org>  wrote:
> >>>>> On 12/14/2010 1:49 PM, MICHAEL MCGRADY  wrote:
> >>>>>> On Dec 14, 2010, at 1:40 PM, Patricia  Shanahan wrote:
> >>>>> ...
> >>>>>>> If  we made a persistent version use a relational  database
> >>>>>>> to represent the  space,
> >>>>>> This would not be usable for us.  This  is too slow and does
> >>>>>> not have the correct QCC  features, especially scalability.
> >>>>>  ...
> >>>>> Could you put some numbers on what sort of  scalability you
> >>>>> require?
> >>>>>  Patricia
> >> Michael McGrady Chief Architect Topia Technology, Inc.  Cel
> >> 1.253.720.3365 Work 1.253.572.9712 extension 2037 
>mmcgrady@topiatechnology.com
> > 
> > 
> > 
>

Re: datastructure classes

Posted by Mike McGrady <mm...@topiatechnology.com>.

Intense network systems like our clients cannot work with database calls.  They are too slow and do not scale.  They either reach a cost or a performance ceiling.

Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 16, 2010, at 5:55 AM, Patricia Shanahan <pa...@acm.org> wrote:

> MICHAEL MCGRADY wrote:
>> Patricia,
>> If you don't mind, I am going to argue for sticking to the machines,
>> but answer your question roughly at the end.
> 
> I don't mind machines, as long as they get somewhat quantified. It was
> "machines a minute" that really gave me trouble. However, there are
> machines and machines, so I would rather think in terms of transaction
> rates.
> 
>> The questions about either memory use or connections or transactions
>> are stressors leading to the question and the urgency whether or not
>> something must scale, but they have nothing or very little to do with
>> whether it will scale.
>> If we try to involve multiple machines, then we will discover
>> frequently that, even if we do not need to scale, we cannot because,
>> for example, of Amdahl's Theorem.  If we cannot go to multiple
>> machines as a practical matter, we cannot scale no matter what
>> (ceteris paribus).  If we can, then we will find, it is our
>> experience, that we can scale no matter what (ceteris paribus).
> 
> Scaling is exactly the reason why I think the current FastList-based
> design is seriously limited, and may need to be replaced by something
> that indexes its data.
> 
> Outrigger turns a JavaSpace read into a linked
> list scan of all the entries in the space for the required type. Not too
> bad if the list rarely changes and is always small enough to fit in
> cache. If it is big or changing, that is going to cause a lot of memory
> traffic.
> 
> As implemented, it cannot be spread across machines very
> effectively because the additional scanning due to one machine not knowing that another has already found the entry would use up the added scan power.
> 
> I think to scale it is going to need some sort of indexing, and as I
> think about indexing for the sort of ad-hoc query that JavaSpaces face,
> combined with frequent change, it starts looking very like maintaining
> an indexed database table.
> 
>> However, we should be able to do, say, hundreds of millions of
>> transactions in a day in real-time critical systems such as the FAA
>> or the stock market with data affinity and integrity and all the
>> other "ilities".  If Outrigger cannot do this, it is of no interest
>> to us.
> 
> The current record for a relational database doing simple transactions
> is 30 million transactions per minute (Oracle/SPARC TPC-C). Your mileage
> may vary, but there is no inherent limit on relational database scaling
> that puts a few hundred thousand transactions per minute out of reach.
> 
> 
>> MG
>> On Dec 14, 2010, at 9:48 PM, Patricia Shanahan wrote:
>>> Please clarify "machines a minute". Can you express it e.g. in
>>> transactions per minute?
>>> Patricia
>>> Mike McGrady wrote:
>>>> Linear - 7000 machines a minute, or more. Sent from my iPhone Michael McGrady Principal investigator AF081_028 SBIR Chief
>>>> Architect Topia Technology, Inc Work 1.253.572.9712 Cel
>>>> 1.253.720.3365 On Dec 14, 2010, at 2:04 PM, Patricia Shanahan
>>>> <pa...@acm.org> wrote:
>>>>> On 12/14/2010 1:49 PM, MICHAEL MCGRADY wrote:
>>>>>> On Dec 14, 2010, at 1:40 PM, Patricia Shanahan wrote:
>>>>> ...
>>>>>>> If we made a persistent version use a relational database
>>>>>>> to represent the space,
>>>>>> This would not be usable for us.  This is too slow and does
>>>>>> not have the correct QCC features, especially scalability.
>>>>> ...
>>>>> Could you put some numbers on what sort of scalability you
>>>>> require?
>>>>> Patricia
>> Michael McGrady Chief Architect Topia Technology, Inc. Cel
>> 1.253.720.3365 Work 1.253.572.9712 extension 2037 mmcgrady@topiatechnology.com
> 
> 
>

Re: datastructure classes

Posted by Sim IJskes - QCG <si...@qcg.nl>.

On 16-12-10 17:14, Mike McGrady wrote:
> I didn't intend to be contentious.  I have absolutely nothing against
> databases and am only giving my experience, and Oracles experience,
> in relation to this topic which I am greatly interested in.  I think
> I can convince the group/list this is not a good start.

Ofcourse i should have qualified my remarks with the context at hand. 
You don't dislike databases. I know that.

> Everyones thoughts could be stated and duly noted.  However, I will
> go off list if you like.  I am not here for heat, only light.  My
> apologies.

No need to go offlist. You have more to contribute than just your 
objections to using a relational db for spaces, what else should we 
watch out for?

Gr. Sim

-- 
QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397

Re: datastructure classes

Posted by Mike McGrady <mm...@topiatechnology.com>.

I didn't intend to be contentious.  I have absolutely nothing against databases and am only giving my experience, and Oracles experience, in relation to this topic which I am greatly interested in.  I think I can convince the group/list this is not a good start. 

Everyones thoughts could be stated and duly noted.  However, I will go off list if you like.  I am not here for heat, only light.  My apologies.

Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 16, 2010, at 7:21 AM, Sim IJskes - QCG <si...@qcg.nl> wrote:

> On 16-12-10 16:11, Mike McGrady wrote:
>> The experience in the industry is that writing directly to a database
>> is too slow and reaches either a cost or a performance ceiling .  The
>> prime candidates for a tuple-space application in the real world is
>> lost, I think, if you write to a database and not to at least a front
>> end cache.  CISCO, Oracle, etc are all going, going, gone in this
>> direction.  It's not an option in our case.
>> 
>>> Apart from that, it would be very interesting to see how a COTS DB
>>> backed javaspace whould behave in practice. And it could be the
>>> first step into producing alternative persistence mechanisms. In
>>> the early stage it would be comfortable to know we don't have to
>>> prove the correctness of a cots-db. In a later stage we can always
>>> look at lifting the transaction based blockstorage layer from derby
>>> or another java based db for instance.
> 
> As you have quoted an email i've send, i will assume that the quoting order got somehow confused.
> 
> Mike, i hope that you are not confusing the process vs the outcome. Driving down the issue that you don't like databases, should not influence a committers freedom to dabble with the idea.
> 
> A relational database for you is out of the question, noted. Shall we move on?
> 
> Gr. Sim
> 
> 
> -- 
> QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
> Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397

Re: datastructure classes

Posted by Sim IJskes - QCG <si...@qcg.nl>.

On 16-12-10 16:11, Mike McGrady wrote:
> The experience in the industry is that writing directly to a database
> is too slow and reaches either a cost or a performance ceiling .  The
> prime candidates for a tuple-space application in the real world is
> lost, I think, if you write to a database and not to at least a front
> end cache.  CISCO, Oracle, etc are all going, going, gone in this
> direction.  It's not an option in our case.
>
>> Apart from that, it would be very interesting to see how a COTS DB
>> backed javaspace whould behave in practice. And it could be the
>> first step into producing alternative persistence mechanisms. In
>> the early stage it would be comfortable to know we don't have to
>> prove the correctness of a cots-db. In a later stage we can always
>> look at lifting the transaction based blockstorage layer from derby
>> or another java based db for instance.

As you have quoted an email i've send, i will assume that the quoting 
order got somehow confused.

Mike, i hope that you are not confusing the process vs the outcome. 
Driving down the issue that you don't like databases, should not 
influence a committers freedom to dabble with the idea.

A relational database for you is out of the question, noted. Shall we 
move on?

Gr. Sim


-- 
QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397

Re: datastructure classes

Posted by Mike McGrady <mm...@topiatechnology.com>.


Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 16, 2010, at 6:09 AM, Sim IJskes - QCG <si...@qcg.nl> wrote:

> On 16-12-10 14:55, Patricia Shanahan wrote:
>>> However, we should be able to do, say, hundreds of millions of
>>> transactions in a day in real-time critical systems such as the FAA
>>> or the stock market with data affinity and integrity and all the
>>> other "ilities". If Outrigger cannot do this, it is of no interest
>>> to us.
>> 
>> The current record for a relational database doing simple transactions
>> is 30 million transactions per minute (Oracle/SPARC TPC-C). Your mileage
>> may vary, but there is no inherent limit on relational database scaling
>> that puts a few hundred thousand transactions per minute out of reach.
> 

The experience in the industry is that writing directly to a database is too slow and reaches either a cost or a performance ceiling .  The prime candidates for a tuple-space application in the real world is lost, I think, if you write to a database and not to at least a front end cache.  CISCO, Oracle, etc are all going, going, gone in this direction.  It's not an option in our case.

> Apart from that, it would be very interesting to see how a COTS DB backed javaspace whould behave in practice. And it could be the first step into producing alternative persistence mechanisms. In the early stage it would be comfortable to know we don't have to prove the correctness of a cots-db. In a later stage we can always look at lifting the transaction based blockstorage layer from derby or another java based db for instance.
> 
> Gr. Sim
> 
> -- 
> QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
> Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397

Re: datastructure classes

Posted by Dan Creswell <da...@gmail.com>.

An old geek joke, related to the mad trademarking of just about anything....

On 21 December 2010 00:28, MICHAEL MCGRADY <mm...@topiatechnology.com>wrote:

> Dan, what is "(TM)"?
>
> On Dec 20, 2010, at 1:13 PM, Dan Creswell wrote:
>
> > Blitz too has optional FIFO-ness via configuration though it's absolutely
> a
> > performance killer for any decent concurrent load by virtue of the usual
> > suspects such as lock contention and the high chance of scanning entry's
> > that have just been taken by a thread just ahead in the queue.
> >
> > To be honest though, unless FIFO is spec'd officially as an option or a
> > default, having developers rely on magic such as FIFO by default is "very
> > bad" (TM).
> >
> > On 20 December 2010 20:30, Patricia Shanahan <pa...@acm.org> wrote:
> >
> >> Tom Hobbs wrote:
> >>
> >>> I know of at least one company which uses Outrigger specifically
> >>> because of it's fortuitous FIFO behaviour.  I'm trying to encourage
> >>> them to move from the Jini 2.1 code to the River release and losing
> >>> the FIFO-ness might be an issue for them.
> >>>
> >>> And yes I know, the spec doesn't specify FIFO, like I said, they
> >>> needed a FIFO space, and Outrigger "just happened" to behave like
> >>> that.
> >>>
> >>
> >> That confirms my suspicion that FIFO-ness is a useful property. I note
> that
> >> Gigaspaces has optional support for FIFO.
> >>
> >> My quick changes to try to fix the current bug will preserve it. If the
> >> long term design loses it for flat-out performance, we should make it
> >> available on a configuration basis.
> >>
> >> Patricia
> >>
> >>
> >>
>
> Michael McGrady
> Chief Architect
> Topia Technology, Inc.
> Cel 1.253.720.3365
> Work 1.253.572.9712 extension 2037
> mmcgrady@topiatechnology.com
>
>
>
>

Re: datastructure classes

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

Dan, what is "(TM)"?

On Dec 20, 2010, at 1:13 PM, Dan Creswell wrote:

> Blitz too has optional FIFO-ness via configuration though it's absolutely a
> performance killer for any decent concurrent load by virtue of the usual
> suspects such as lock contention and the high chance of scanning entry's
> that have just been taken by a thread just ahead in the queue.
> 
> To be honest though, unless FIFO is spec'd officially as an option or a
> default, having developers rely on magic such as FIFO by default is "very
> bad" (TM).
> 
> On 20 December 2010 20:30, Patricia Shanahan <pa...@acm.org> wrote:
> 
>> Tom Hobbs wrote:
>> 
>>> I know of at least one company which uses Outrigger specifically
>>> because of it's fortuitous FIFO behaviour.  I'm trying to encourage
>>> them to move from the Jini 2.1 code to the River release and losing
>>> the FIFO-ness might be an issue for them.
>>> 
>>> And yes I know, the spec doesn't specify FIFO, like I said, they
>>> needed a FIFO space, and Outrigger "just happened" to behave like
>>> that.
>>> 
>> 
>> That confirms my suspicion that FIFO-ness is a useful property. I note that
>> Gigaspaces has optional support for FIFO.
>> 
>> My quick changes to try to fix the current bug will preserve it. If the
>> long term design loses it for flat-out performance, we should make it
>> available on a configuration basis.
>> 
>> Patricia
>> 
>> 
>> 

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: datastructure classes

Posted by Dan Creswell <da...@gmail.com>.

Blitz too has optional FIFO-ness via configuration though it's absolutely a
performance killer for any decent concurrent load by virtue of the usual
suspects such as lock contention and the high chance of scanning entry's
that have just been taken by a thread just ahead in the queue.

To be honest though, unless FIFO is spec'd officially as an option or a
default, having developers rely on magic such as FIFO by default is "very
bad" (TM).

On 20 December 2010 20:30, Patricia Shanahan <pa...@acm.org> wrote:

> Tom Hobbs wrote:
>
>> I know of at least one company which uses Outrigger specifically
>> because of it's fortuitous FIFO behaviour.  I'm trying to encourage
>> them to move from the Jini 2.1 code to the River release and losing
>> the FIFO-ness might be an issue for them.
>>
>> And yes I know, the spec doesn't specify FIFO, like I said, they
>> needed a FIFO space, and Outrigger "just happened" to behave like
>> that.
>>
>
> That confirms my suspicion that FIFO-ness is a useful property. I note that
> Gigaspaces has optional support for FIFO.
>
> My quick changes to try to fix the current bug will preserve it. If the
> long term design loses it for flat-out performance, we should make it
> available on a configuration basis.
>
> Patricia
>
>
>

Re: datastructure classes

Posted by Greg Trasuk <tr...@stratuscom.com>.

Seems to me that FIFO handling would be a bit of an illusion at the best
of times.  What happens if you take an entry within a transaction, then
the transaction gets rolled back?  The entry gets dropped back into the
space, in a different order than the original.

I've always assumed that if you want to process entries in order, you
need to pull them all and re-order locally, or put a sequence number as
a field, then take() them in the order you want them.

By the way, I'm casually assuming the model where you're using the space
as a messaging medium rather than storage.  Your mileage may vary.

Cheers,

Greg.


On Mon, 2010-12-20 at 16:19, Dan Creswell wrote:
> If that's what is required, you'd best change the spec to reflect this
> default. Failing to do this leaves developers open to nasty surprises like:
> 
> (1) Develop code on javaspace that supports FIFO by default.
> (2) Pronounce code safe and debugged.
> (3) Deploy code onto some other javaspace (e.g. a commerically supported one
> to keep management happy).
> (4) Be assaulted and confused by the sudden appearance of problems with code
> that ran just fine before for no apparent reason.
> 
> As Tom hints below, this problem is already with us so sorting out specs and
> such cannot be a bad thing.
> 
> "But, without some sort of implicit ordering, there is a chance of
> starvation or at best prolonged staleness of entries which can turn people
> off of JavaSpaces as a solution for all out performant network applications,
> and that is exactly what we don't want."
> 
> I'd note that for anything other than the least demanding (concurrency wise)
> "performant network applications" FIFO is a performance killer that might
> well put developers off anyways. As with so many things Javaspaces there are
> two sides to every tradeoff and neither is perfect for all cases.
> 
> On 20 December 2010 20:41, Gregg Wonderly <gr...@wonderly.org> wrote:
> 
> > On 12/20/2010 2:30 PM, Patricia Shanahan wrote:
> >
> >> Tom Hobbs wrote:
> >>
> >>> I know of at least one company which uses Outrigger specifically
> >>> because of it's fortuitous FIFO behaviour. I'm trying to encourage
> >>> them to move from the Jini 2.1 code to the River release and losing
> >>> the FIFO-ness might be an issue for them.
> >>>
> >>> And yes I know, the spec doesn't specify FIFO, like I said, they
> >>> needed a FIFO space, and Outrigger "just happened" to behave like
> >>> that.
> >>>
> >>
> >> That confirms my suspicion that FIFO-ness is a useful property. I note
> >> that
> >> Gigaspaces has optional support for FIFO.
> >>
> >> My quick changes to try to fix the current bug will preserve it. If the
> >> long
> >> term design loses it for flat-out performance, we should make it available
> >> on a
> >> configuration basis.
> >>
> >
> > In any case, I think it is important to think strongly about supporting
> > FIFO as the default behavior.  This helps people see and appreciate
> > "progress" of
> > Entry values within their system.  If better, predictable, performance can
> > be had from non-FIFO ordering, that would be great.  But, without some sort
> > of implicit ordering, there is a chance of starvation or at best prolonged
> > staleness of entries which can turn people off of JavaSpaces as a solution
> > for all out performant network applications, and that is exactly what we
> > don't want.
> >
> > Gregg
> >
-- 
Greg Trasuk, President
StratusCom Manufacturing Systems Inc. - We use information technology to
solve business problems on your plant floor.
http://stratuscom.com

Re: datastructure classes

Posted by Dan Creswell <da...@gmail.com>.

It's a matter of what you want to say to the users:

(1) Putting it in the Javaspaces spec allows users to be (not entirely but
reasonably) confident of consistent behaviour across all implementations
such that their code will run on any one they choose with no change.

(2) Putting it in the Outrigger spec means users must adopt a "buyer beware"
stance and understand that their code may end up explicitly or by accident
dependent on Outrigger's specific implementation(*)

If you're asking me for an opinion, I'd say option (2) is the right choice
and it ought to be a config option. On or off, by default? Dunno, I've seen
some users who want FIFO by default and some that don't.

(*) In the case of FIFO, if we make it configurable but specify no default
then it's entirely possible that Outrigger defaults to FIFO and e.g. Blitz
doesn't, which again will surprise users.

On 21 December 2010 01:24, Patricia Shanahan <pa...@acm.org> wrote:

> MICHAEL MCGRADY wrote:
>
>> What is the spec that is being potentially changed?  Outrigger can
>> promise FIFO but this does not mean, I assume, that other
>> implementations of JavaSpaces have to deliver FIFO as well.  Pulling
>> an implementation into the specs for JavaSpaces would, I think, be
>> counter-productive.
>>
>
> Good question. Perhaps Outrigger package javadocs? We definitely need
> somewhere for documentation that is specific to Outrigger beyond the fact
> that it is an implementation of the JavaSpaces specification.
>
> Patricia
>
>

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

MICHAEL MCGRADY wrote:
> What is the spec that is being potentially changed?  Outrigger can
> promise FIFO but this does not mean, I assume, that other
> implementations of JavaSpaces have to deliver FIFO as well.  Pulling
> an implementation into the specs for JavaSpaces would, I think, be
> counter-productive.

Good question. Perhaps Outrigger package javadocs? We definitely need 
somewhere for documentation that is specific to Outrigger beyond the 
fact that it is an implementation of the JavaSpaces specification.

Patricia

Re: datastructure classes

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

What is the spec that is being potentially changed?  Outrigger can promise FIFO but this does not mean, I assume, that other implementations of JavaSpaces have to deliver FIFO as well.  Pulling an implementation into the specs for JavaSpaces would, I think, be counter-productive.

MG

On Dec 20, 2010, at 1:19 PM, Dan Creswell wrote:

> If that's what is required, you'd best change the spec
> to reflect this
> default. Failing to do this leaves developers open to nasty surprises like:
> 
> (1) Develop code on javaspace that supports FIFO by default.
> (2) Pronounce code safe and debugged.
> (3) Deploy code onto some other javaspace (e.g. a commerically supported one
> to keep management happy).
> (4) Be assaulted and confused by the sudden appearance of problems with code
> that ran just fine before for no apparent reason.
> 
> As Tom hints below, this problem is already with us so sorting out specs and
> such cannot be a bad thing.
> 
> "But, without some sort of implicit ordering, there is a chance of
> starvation or at best prolonged staleness of entries which can turn people
> off of JavaSpaces as a solution for all out performant network applications,
> and that is exactly what we don't want."
> 
> I'd note that for anything other than the least demanding (concurrency wise)
> "performant network applications" FIFO is a performance killer that might
> well put developers off anyways. As with so many things Javaspaces there are
> two sides to every tradeoff and neither is perfect for all cases.
> 
> On 20 December 2010 20:41, Gregg Wonderly <gr...@wonderly.org> wrote:
> 
>> On 12/20/2010 2:30 PM, Patricia Shanahan wrote:
>> 
>>> Tom Hobbs wrote:
>>> 
>>>> I know of at least one company which uses Outrigger specifically
>>>> because of it's fortuitous FIFO behaviour. I'm trying to encourage
>>>> them to move from the Jini 2.1 code to the River release and losing
>>>> the FIFO-ness might be an issue for them.
>>>> 
>>>> And yes I know, the spec doesn't specify FIFO, like I said, they
>>>> needed a FIFO space, and Outrigger "just happened" to behave like
>>>> that.
>>>> 
>>> 
>>> That confirms my suspicion that FIFO-ness is a useful property. I note
>>> that
>>> Gigaspaces has optional support for FIFO.
>>> 
>>> My quick changes to try to fix the current bug will preserve it. If the
>>> long
>>> term design loses it for flat-out performance, we should make it available
>>> on a
>>> configuration basis.
>>> 
>> 
>> In any case, I think it is important to think strongly about supporting
>> FIFO as the default behavior.  This helps people see and appreciate
>> "progress" of
>> Entry values within their system.  If better, predictable, performance can
>> be had from non-FIFO ordering, that would be great.  But, without some sort
>> of implicit ordering, there is a chance of starvation or at best prolonged
>> staleness of entries which can turn people off of JavaSpaces as a solution
>> for all out performant network applications, and that is exactly what we
>> don't want.
>> 
>> Gregg
>> 

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: datastructure classes

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

I also agree.

MG

On Dec 20, 2010, at 2:22 PM, Gregg Wonderly wrote:

> On 12/20/2010 3:19 PM, Dan Creswell wrote:
>> If that's what is required, you'd best change the spec to reflect this
>> default. Failing to do this leaves developers open to nasty surprises like:
>> 
>> (1) Develop code on javaspace that supports FIFO by default.
>> (2) Pronounce code safe and debugged.
>> (3) Deploy code onto some other javaspace (e.g. a commerically supported one
>> to keep management happy).
>> (4) Be assaulted and confused by the sudden appearance of problems with code
>> that ran just fine before for no apparent reason.
>> 
>> As Tom hints below, this problem is already with us so sorting out specs and
>> such cannot be a bad thing.
>> 
>> "But, without some sort of implicit ordering, there is a chance of
>> starvation or at best prolonged staleness of entries which can turn people
>> off of JavaSpaces as a solution for all out performant network applications,
>> and that is exactly what we don't want."
>> 
>> I'd note that for anything other than the least demanding (concurrency wise)
>> "performant network applications" FIFO is a performance killer that might
>> well put developers off anyways. As with so many things Javaspaces there are
>> two sides to every tradeoff and neither is perfect for all cases.
> 
> I completely agree.  There are many avenues of success and failure each with both detracting and meritorious viewpoints.
> 
> Gregg
> 
>> On 20 December 2010 20:41, Gregg Wonderly<gr...@wonderly.org>  wrote:
>> 
>>> On 12/20/2010 2:30 PM, Patricia Shanahan wrote:
>>> 
>>>> Tom Hobbs wrote:
>>>> 
>>>>> I know of at least one company which uses Outrigger specifically
>>>>> because of it's fortuitous FIFO behaviour. I'm trying to encourage
>>>>> them to move from the Jini 2.1 code to the River release and losing
>>>>> the FIFO-ness might be an issue for them.
>>>>> 
>>>>> And yes I know, the spec doesn't specify FIFO, like I said, they
>>>>> needed a FIFO space, and Outrigger "just happened" to behave like
>>>>> that.
>>>>> 
>>>> 
>>>> That confirms my suspicion that FIFO-ness is a useful property. I note
>>>> that
>>>> Gigaspaces has optional support for FIFO.
>>>> 
>>>> My quick changes to try to fix the current bug will preserve it. If the
>>>> long
>>>> term design loses it for flat-out performance, we should make it available
>>>> on a
>>>> configuration basis.
>>>> 
>>> 
>>> In any case, I think it is important to think strongly about supporting
>>> FIFO as the default behavior.  This helps people see and appreciate
>>> "progress" of
>>> Entry values within their system.  If better, predictable, performance can
>>> be had from non-FIFO ordering, that would be great.  But, without some sort
>>> of implicit ordering, there is a chance of starvation or at best prolonged
>>> staleness of entries which can turn people off of JavaSpaces as a solution
>>> for all out performant network applications, and that is exactly what we
>>> don't want.
>>> 
>>> Gregg
>>> 
>> 
> 

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: datastructure classes

Posted by Gregg Wonderly <gr...@wonderly.org>.

On 12/20/2010 3:19 PM, Dan Creswell wrote:
> If that's what is required, you'd best change the spec to reflect this
> default. Failing to do this leaves developers open to nasty surprises like:
>
> (1) Develop code on javaspace that supports FIFO by default.
> (2) Pronounce code safe and debugged.
> (3) Deploy code onto some other javaspace (e.g. a commerically supported one
> to keep management happy).
> (4) Be assaulted and confused by the sudden appearance of problems with code
> that ran just fine before for no apparent reason.
>
> As Tom hints below, this problem is already with us so sorting out specs and
> such cannot be a bad thing.
>
> "But, without some sort of implicit ordering, there is a chance of
> starvation or at best prolonged staleness of entries which can turn people
> off of JavaSpaces as a solution for all out performant network applications,
> and that is exactly what we don't want."
>
> I'd note that for anything other than the least demanding (concurrency wise)
> "performant network applications" FIFO is a performance killer that might
> well put developers off anyways. As with so many things Javaspaces there are
> two sides to every tradeoff and neither is perfect for all cases.

I completely agree.  There are many avenues of success and failure each with 
both detracting and meritorious viewpoints.

Gregg

> On 20 December 2010 20:41, Gregg Wonderly<gr...@wonderly.org>  wrote:
>
>> On 12/20/2010 2:30 PM, Patricia Shanahan wrote:
>>
>>> Tom Hobbs wrote:
>>>
>>>> I know of at least one company which uses Outrigger specifically
>>>> because of it's fortuitous FIFO behaviour. I'm trying to encourage
>>>> them to move from the Jini 2.1 code to the River release and losing
>>>> the FIFO-ness might be an issue for them.
>>>>
>>>> And yes I know, the spec doesn't specify FIFO, like I said, they
>>>> needed a FIFO space, and Outrigger "just happened" to behave like
>>>> that.
>>>>
>>>
>>> That confirms my suspicion that FIFO-ness is a useful property. I note
>>> that
>>> Gigaspaces has optional support for FIFO.
>>>
>>> My quick changes to try to fix the current bug will preserve it. If the
>>> long
>>> term design loses it for flat-out performance, we should make it available
>>> on a
>>> configuration basis.
>>>
>>
>> In any case, I think it is important to think strongly about supporting
>> FIFO as the default behavior.  This helps people see and appreciate
>> "progress" of
>> Entry values within their system.  If better, predictable, performance can
>> be had from non-FIFO ordering, that would be great.  But, without some sort
>> of implicit ordering, there is a chance of starvation or at best prolonged
>> staleness of entries which can turn people off of JavaSpaces as a solution
>> for all out performant network applications, and that is exactly what we
>> don't want.
>>
>> Gregg
>>
>

Re: datastructure classes

Posted by Dan Creswell <da...@gmail.com>.

If that's what is required, you'd best change the spec to reflect this
default. Failing to do this leaves developers open to nasty surprises like:

(1) Develop code on javaspace that supports FIFO by default.
(2) Pronounce code safe and debugged.
(3) Deploy code onto some other javaspace (e.g. a commerically supported one
to keep management happy).
(4) Be assaulted and confused by the sudden appearance of problems with code
that ran just fine before for no apparent reason.

As Tom hints below, this problem is already with us so sorting out specs and
such cannot be a bad thing.

"But, without some sort of implicit ordering, there is a chance of
starvation or at best prolonged staleness of entries which can turn people
off of JavaSpaces as a solution for all out performant network applications,
and that is exactly what we don't want."

I'd note that for anything other than the least demanding (concurrency wise)
"performant network applications" FIFO is a performance killer that might
well put developers off anyways. As with so many things Javaspaces there are
two sides to every tradeoff and neither is perfect for all cases.

On 20 December 2010 20:41, Gregg Wonderly <gr...@wonderly.org> wrote:

> On 12/20/2010 2:30 PM, Patricia Shanahan wrote:
>
>> Tom Hobbs wrote:
>>
>>> I know of at least one company which uses Outrigger specifically
>>> because of it's fortuitous FIFO behaviour. I'm trying to encourage
>>> them to move from the Jini 2.1 code to the River release and losing
>>> the FIFO-ness might be an issue for them.
>>>
>>> And yes I know, the spec doesn't specify FIFO, like I said, they
>>> needed a FIFO space, and Outrigger "just happened" to behave like
>>> that.
>>>
>>
>> That confirms my suspicion that FIFO-ness is a useful property. I note
>> that
>> Gigaspaces has optional support for FIFO.
>>
>> My quick changes to try to fix the current bug will preserve it. If the
>> long
>> term design loses it for flat-out performance, we should make it available
>> on a
>> configuration basis.
>>
>
> In any case, I think it is important to think strongly about supporting
> FIFO as the default behavior.  This helps people see and appreciate
> "progress" of
> Entry values within their system.  If better, predictable, performance can
> be had from non-FIFO ordering, that would be great.  But, without some sort
> of implicit ordering, there is a chance of starvation or at best prolonged
> staleness of entries which can turn people off of JavaSpaces as a solution
> for all out performant network applications, and that is exactly what we
> don't want.
>
> Gregg
>

Re: datastructure classes

Posted by Gregg Wonderly <gr...@wonderly.org>.

On 12/20/2010 2:30 PM, Patricia Shanahan wrote:
> Tom Hobbs wrote:
>> I know of at least one company which uses Outrigger specifically
>> because of it's fortuitous FIFO behaviour. I'm trying to encourage
>> them to move from the Jini 2.1 code to the River release and losing
>> the FIFO-ness might be an issue for them.
>>
>> And yes I know, the spec doesn't specify FIFO, like I said, they
>> needed a FIFO space, and Outrigger "just happened" to behave like
>> that.
>
> That confirms my suspicion that FIFO-ness is a useful property. I note that
> Gigaspaces has optional support for FIFO.
>
> My quick changes to try to fix the current bug will preserve it. If the long
> term design loses it for flat-out performance, we should make it available on a
> configuration basis.

In any case, I think it is important to think strongly about supporting FIFO as 
the default behavior.  This helps people see and appreciate "progress" of
Entry values within their system.  If better, predictable, performance can be 
had from non-FIFO ordering, that would be great.  But, without some sort of 
implicit ordering, there is a chance of starvation or at best prolonged 
staleness of entries which can turn people off of JavaSpaces as a solution for 
all out performant network applications, and that is exactly what we don't want.

Gregg

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

Tom Hobbs wrote:
> I know of at least one company which uses Outrigger specifically
> because of it's fortuitous FIFO behaviour.  I'm trying to encourage
> them to move from the Jini 2.1 code to the River release and losing
> the FIFO-ness might be an issue for them.
> 
> And yes I know, the spec doesn't specify FIFO, like I said, they
> needed a FIFO space, and Outrigger "just happened" to behave like
> that.

That confirms my suspicion that FIFO-ness is a useful property. I note 
that Gigaspaces has optional support for FIFO.

My quick changes to try to fix the current bug will preserve it. If the 
long term design loses it for flat-out performance, we should make it 
available on a configuration basis.

Patricia

Re: datastructure classes

Posted by Tom Hobbs <tv...@googlemail.com>.

I know of at least one company which uses Outrigger specifically
because of it's fortuitous FIFO behaviour.  I'm trying to encourage
them to move from the Jini 2.1 code to the River release and losing
the FIFO-ness might be an issue for them.

And yes I know, the spec doesn't specify FIFO, like I said, they
needed a FIFO space, and Outrigger "just happened" to behave like
that.

On Sat, Dec 18, 2010 at 12:03 AM, Patricia Shanahan <pa...@acm.org> wrote:
> On 12/17/2010 8:00 AM, Gregg Wonderly wrote:
> ...
>>
>> One of the primary issues with bandwidth through any system is latency.
>> While multiprocessor/multi-core and distributed computing can provide
>> huge bandwidth possibilities, the underlying issue is per transaction
>> latency.
>>
>> If you look simply, at the JDBC model for example, the act of converting
>> the JDBC activities to network traffic and back (marshal/unmarshal) is
>> one of the primary "time consuming operations". When you add onto that
>> other overhead associated with how each database authenticates and
>> manages its "network" traffic. I have no exact numbers to demonstrate
>> this, but it is something which I have very long experience dealing
>> with, over the years, with my broker (built before JMS existed) and
>> catching up 100's of thousands of transactions into databases which have
>> gone down for maintenance.
>>
>> JDBC only allows one transaction per connection, so you have context
>> overhead involved there. With each transaction coming across a separate
>> TNS-Listener process in oracle, you have OS context switching issues
>> that inject latency.
>>
>> Overall, the bandwidth can be very large, but the per transaction
>> latency is probably the biggest reason that SQL databases are not always
>> the best choice for some types of performance needs.
>
> Thanks. It's very useful to know specific gotchas with a given
> implementation strategy.
>
> Patricia
>

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/17/2010 8:00 AM, Gregg Wonderly wrote:
...
> One of the primary issues with bandwidth through any system is latency.
> While multiprocessor/multi-core and distributed computing can provide
> huge bandwidth possibilities, the underlying issue is per transaction
> latency.
>
> If you look simply, at the JDBC model for example, the act of converting
> the JDBC activities to network traffic and back (marshal/unmarshal) is
> one of the primary "time consuming operations". When you add onto that
> other overhead associated with how each database authenticates and
> manages its "network" traffic. I have no exact numbers to demonstrate
> this, but it is something which I have very long experience dealing
> with, over the years, with my broker (built before JMS existed) and
> catching up 100's of thousands of transactions into databases which have
> gone down for maintenance.
>
> JDBC only allows one transaction per connection, so you have context
> overhead involved there. With each transaction coming across a separate
> TNS-Listener process in oracle, you have OS context switching issues
> that inject latency.
>
> Overall, the bandwidth can be very large, but the per transaction
> latency is probably the biggest reason that SQL databases are not always
> the best choice for some types of performance needs.

Thanks. It's very useful to know specific gotchas with a given 
implementation strategy.

Patricia

Re: datastructure classes

Posted by Mike McGrady <mm...@topiatechnology.com>.

I concur.

Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 17, 2010, at 8:00 AM, Gregg Wonderly <gr...@wonderly.org> wrote:

> On 12/16/2010 8:09 AM, Sim IJskes - QCG wrote:
>> On 16-12-10 14:55, Patricia Shanahan wrote:
>>>> However, we should be able to do, say, hundreds of millions of
>>>> transactions in a day in real-time critical systems such as the FAA
>>>> or the stock market with data affinity and integrity and all the
>>>> other "ilities". If Outrigger cannot do this, it is of no interest
>>>> to us.
>>> 
>>> The current record for a relational database doing simple transactions
>>> is 30 million transactions per minute (Oracle/SPARC TPC-C). Your mileage
>>> may vary, but there is no inherent limit on relational database scaling
>>> that puts a few hundred thousand transactions per minute out of reach.
>> 
>> Apart from that, it would be very interesting to see how a COTS DB backed
>> javaspace whould behave in practice. And it could be the first step into
>> producing alternative persistence mechanisms. In the early stage it would be
>> comfortable to know we don't have to prove the correctness of a cots-db. In a
>> later stage we can always look at lifting the transaction based blockstorage
>> layer from derby or another java based db for instance.
> 
> One of the primary issues with bandwidth through any system is latency.  While multiprocessor/multi-core and distributed computing can provide huge bandwidth possibilities, the underlying issue is per transaction latency.
> 
> If you look simply, at the JDBC model for example, the act of converting the JDBC activities to network traffic and back (marshal/unmarshal) is one of the primary "time consuming operations".  When you add onto that other overhead associated with how each database authenticates and manages its "network" traffic.  I have no exact numbers to demonstrate this, but it is something which I have very long experience dealing with, over the years, with my broker (built before JMS existed) and catching up 100's of thousands of transactions into databases which have gone down for maintenance.
> 
> JDBC only allows one transaction per connection, so you have context overhead involved there.  With each transaction coming across a separate TNS-Listener process in oracle, you have OS context switching issues that inject latency.
> 
> Overall, the bandwidth can be very large, but the per transaction latency is probably the biggest reason that SQL databases are not always the best choice for some types of performance needs.
> 
> Gregg Wonderly

Re: datastructure classes

Posted by Gregg Wonderly <gr...@wonderly.org>.

On 12/16/2010 8:09 AM, Sim IJskes - QCG wrote:
> On 16-12-10 14:55, Patricia Shanahan wrote:
>>> However, we should be able to do, say, hundreds of millions of
>>> transactions in a day in real-time critical systems such as the FAA
>>> or the stock market with data affinity and integrity and all the
>>> other "ilities". If Outrigger cannot do this, it is of no interest
>>> to us.
>>
>> The current record for a relational database doing simple transactions
>> is 30 million transactions per minute (Oracle/SPARC TPC-C). Your mileage
>> may vary, but there is no inherent limit on relational database scaling
>> that puts a few hundred thousand transactions per minute out of reach.
>
> Apart from that, it would be very interesting to see how a COTS DB backed
> javaspace whould behave in practice. And it could be the first step into
> producing alternative persistence mechanisms. In the early stage it would be
> comfortable to know we don't have to prove the correctness of a cots-db. In a
> later stage we can always look at lifting the transaction based blockstorage
> layer from derby or another java based db for instance.

One of the primary issues with bandwidth through any system is latency.  While 
multiprocessor/multi-core and distributed computing can provide huge bandwidth 
possibilities, the underlying issue is per transaction latency.

If you look simply, at the JDBC model for example, the act of converting the 
JDBC activities to network traffic and back (marshal/unmarshal) is one of the 
primary "time consuming operations".  When you add onto that other overhead 
associated with how each database authenticates and manages its "network" 
traffic.  I have no exact numbers to demonstrate this, but it is something which 
I have very long experience dealing with, over the years, with my broker (built 
before JMS existed) and catching up 100's of thousands of transactions into 
databases which have gone down for maintenance.

JDBC only allows one transaction per connection, so you have context overhead 
involved there.  With each transaction coming across a separate TNS-Listener 
process in oracle, you have OS context switching issues that inject latency.

Overall, the bandwidth can be very large, but the per transaction latency is 
probably the biggest reason that SQL databases are not always the best choice 
for some types of performance needs.

Gregg Wonderly

Re: datastructure classes

Posted by Mike McGrady <mm...@topiatechnology.com>.

I am not saying, however, that we don't do transactions.  It is also critical and so to with replication and partitioning.

Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 16, 2010, at 6:09 AM, Sim IJskes - QCG <si...@qcg.nl> wrote:

> On 16-12-10 14:55, Patricia Shanahan wrote:
>>> However, we should be able to do, say, hundreds of millions of
>>> transactions in a day in real-time critical systems such as the FAA
>>> or the stock market with data affinity and integrity and all the
>>> other "ilities". If Outrigger cannot do this, it is of no interest
>>> to us.
>> 
>> The current record for a relational database doing simple transactions
>> is 30 million transactions per minute (Oracle/SPARC TPC-C). Your mileage
>> may vary, but there is no inherent limit on relational database scaling
>> that puts a few hundred thousand transactions per minute out of reach.
> 
> Apart from that, it would be very interesting to see how a COTS DB backed javaspace whould behave in practice. And it could be the first step into producing alternative persistence mechanisms. In the early stage it would be comfortable to know we don't have to prove the correctness of a cots-db. In a later stage we can always look at lifting the transaction based blockstorage layer from derby or another java based db for instance.
> 
> Gr. Sim
> 
> -- 
> QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
> Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397

Re: datastructure classes

Posted by Sim IJskes - QCG <si...@qcg.nl>.

On 16-12-10 14:55, Patricia Shanahan wrote:
>> However, we should be able to do, say, hundreds of millions of
>> transactions in a day in real-time critical systems such as the FAA
>> or the stock market with data affinity and integrity and all the
>> other "ilities". If Outrigger cannot do this, it is of no interest
>> to us.
>
> The current record for a relational database doing simple transactions
> is 30 million transactions per minute (Oracle/SPARC TPC-C). Your mileage
> may vary, but there is no inherent limit on relational database scaling
> that puts a few hundred thousand transactions per minute out of reach.

Apart from that, it would be very interesting to see how a COTS DB 
backed javaspace whould behave in practice. And it could be the first 
step into producing alternative persistence mechanisms. In the early 
stage it would be comfortable to know we don't have to prove the 
correctness of a cots-db. In a later stage we can always look at lifting 
the transaction based blockstorage layer from derby or another java 
based db for instance.

Gr. Sim

-- 
QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

MICHAEL MCGRADY wrote:
> Patricia,
> 
> If you don't mind, I am going to argue for sticking to the machines,
> but answer your question roughly at the end.

I don't mind machines, as long as they get somewhat quantified. It was
"machines a minute" that really gave me trouble. However, there are
machines and machines, so I would rather think in terms of transaction
rates.

> 
> The questions about either memory use or connections or transactions
> are stressors leading to the question and the urgency whether or not
> something must scale, but they have nothing or very little to do with
> whether it will scale.
> 
> If we try to involve multiple machines, then we will discover
> frequently that, even if we do not need to scale, we cannot because,
> for example, of Amdahl's Theorem.  If we cannot go to multiple
> machines as a practical matter, we cannot scale no matter what
> (ceteris paribus).  If we can, then we will find, it is our
> experience, that we can scale no matter what (ceteris paribus).

Scaling is exactly the reason why I think the current FastList-based
design is seriously limited, and may need to be replaced by something
that indexes its data.

Outrigger turns a JavaSpace read into a linked
list scan of all the entries in the space for the required type. Not too
bad if the list rarely changes and is always small enough to fit in
cache. If it is big or changing, that is going to cause a lot of memory
traffic.

As implemented, it cannot be spread across machines very
effectively because the additional scanning due to one machine not 
knowing that another has already found the entry would use up the added 
scan power.

I think to scale it is going to need some sort of indexing, and as I
think about indexing for the sort of ad-hoc query that JavaSpaces face,
combined with frequent change, it starts looking very like maintaining
an indexed database table.

> However, we should be able to do, say, hundreds of millions of
> transactions in a day in real-time critical systems such as the FAA
> or the stock market with data affinity and integrity and all the
> other "ilities".  If Outrigger cannot do this, it is of no interest
> to us.

The current record for a relational database doing simple transactions
is 30 million transactions per minute (Oracle/SPARC TPC-C). Your mileage
may vary, but there is no inherent limit on relational database scaling
that puts a few hundred thousand transactions per minute out of reach.

> 
> MG
> 
> 
> On Dec 14, 2010, at 9:48 PM, Patricia Shanahan wrote:
> 
>> Please clarify "machines a minute". Can you express it e.g. in
>> transactions per minute?
>> 
>> Patricia
>> 
>> Mike McGrady wrote:
>>> Linear - 7000 machines a minute, or more. Sent from my iPhone 
>>> Michael McGrady Principal investigator AF081_028 SBIR Chief
>>> Architect Topia Technology, Inc Work 1.253.572.9712 Cel
>>> 1.253.720.3365 On Dec 14, 2010, at 2:04 PM, Patricia Shanahan
>>> <pa...@acm.org> wrote:
>>>> On 12/14/2010 1:49 PM, MICHAEL MCGRADY wrote:
>>>>> On Dec 14, 2010, at 1:40 PM, Patricia Shanahan wrote:
>>>> ...
>>>>>> If we made a persistent version use a relational database
>>>>>> to represent the space,
>>>>> This would not be usable for us.  This is too slow and does
>>>>> not have the correct QCC features, especially scalability.
>>>> ...
>>>> 
>>>> Could you put some numbers on what sort of scalability you
>>>> require?
>>>> 
>>>> Patricia
> 
> Michael McGrady Chief Architect Topia Technology, Inc. Cel
> 1.253.720.3365 Work 1.253.572.9712 extension 2037 
> mmcgrady@topiatechnology.com
> 
> 
> 
>

Re: datastructure classes

Posted by MICHAEL MCGRADY <MM...@topiatechnology.com>.

Patricia,

If you don't mind, I am going to argue for sticking to the machines, but answer your question roughly at the end.

The questions about either memory use or connections or transactions are stressors leading to the question and the urgency whether or not something must scale, but they have nothing or very little to do with whether it will scale.  

If we try to involve multiple machines, then we will discover frequently that, even if we do not need to scale, we cannot because, for example, of Amdahl's Theorem.  If we cannot go to multiple machines as a practical matter, we cannot scale no matter what (ceteris paribus).  If we can, then we will find, it is our experience, that we can scale no matter what (ceteris paribus).  

However, we should be able to do, say, hundreds of millions of transactions in a day in real-time critical systems such as the FAA or the stock market with data affinity and integrity and all the other "ilities".  If Outrigger cannot do this, it is of no interest to us. 

MG

On Dec 14, 2010, at 9:48 PM, Patricia Shanahan wrote:

> Please clarify "machines a minute". Can you express it e.g. in transactions per minute?
> 
> Patricia
> 
> Mike McGrady wrote:
>> Linear - 7000 machines a minute, or more.
>> Sent from my iPhone
>> Michael McGrady
>> Principal investigator AF081_028 SBIR
>> Chief Architect
>> Topia Technology, Inc
>> Work 1.253.572.9712
>> Cel 1.253.720.3365
>> On Dec 14, 2010, at 2:04 PM, Patricia Shanahan <pa...@acm.org> wrote:
>>> On 12/14/2010 1:49 PM, MICHAEL MCGRADY wrote:
>>>> On Dec 14, 2010, at 1:40 PM, Patricia Shanahan wrote:
>>> ...
>>>>> If we made a persistent version use a relational database to represent
>>>>> the space,
>>>> This would not be usable for us.  This is too slow and does not have the correct QCC features, especially scalability.
>>> ...
>>> 
>>> Could you put some numbers on what sort of scalability you require?
>>> 
>>> Patricia
> 

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

Please clarify "machines a minute". Can you express it e.g. in 
transactions per minute?

Patricia

Mike McGrady wrote:
> Linear - 7000 machines a minute, or more.
> 
> Sent from my iPhone
> 
> Michael McGrady
> Principal investigator AF081_028 SBIR
> Chief Architect
> Topia Technology, Inc
> Work 1.253.572.9712
> Cel 1.253.720.3365
> 
> On Dec 14, 2010, at 2:04 PM, Patricia Shanahan <pa...@acm.org> wrote:
> 
>> On 12/14/2010 1:49 PM, MICHAEL MCGRADY wrote:
>>> On Dec 14, 2010, at 1:40 PM, Patricia Shanahan wrote:
>> ...
>>>> If we made a persistent version use a relational database to represent
>>>> the space,
>>> This would not be usable for us.  This is too slow and does not have the correct QCC features, especially scalability.
>> ...
>>
>> Could you put some numbers on what sort of scalability you require?
>>
>> Patricia
>

Re: datastructure classes

Posted by Mike McGrady <mm...@topiatechnology.com>.

Linear - 7000 machines a minute, or more.

Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 14, 2010, at 2:04 PM, Patricia Shanahan <pa...@acm.org> wrote:

> On 12/14/2010 1:49 PM, MICHAEL MCGRADY wrote:
>> 
>> On Dec 14, 2010, at 1:40 PM, Patricia Shanahan wrote:
> ...
>>> If we made a persistent version use a relational database to represent
>>> the space,
>> 
>> This would not be usable for us.  This is too slow and does not have the correct QCC features, especially scalability.
> ...
> 
> Could you put some numbers on what sort of scalability you require?
> 
> Patricia

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/14/2010 1:49 PM, MICHAEL MCGRADY wrote:
>
> On Dec 14, 2010, at 1:40 PM, Patricia Shanahan wrote:
...
>> If we made a persistent version use a relational database to represent
>> the space,
>
> This would not be usable for us.  This is too slow and does not have the correct QCC features, especially scalability.
...

Could you put some numbers on what sort of scalability you require?

Patricia

Re: datastructure classes

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

On Dec 14, 2010, at 1:40 PM, Patricia Shanahan wrote:

> On 12/14/2010 8:37 AM, Gregg Wonderly wrote:
>> On 12/14/2010 1:36 AM, MICHAEL MCGRADY wrote:
>>> I would say that in addition to just be a fast data structure the data
>>> structure
>> > must be fast and accommodate synchronous and asynchronous backups,
>> partitions,
>> > and transactions.
>> 
>> This is an important issue from the perspective that there are two
>> scenarios that used to be supported by outrigger. A persistent and an
>> non-persistent version used to exist. The persistent version used PSE
>> for serialization to disk. That was a simple yet powerful mechanism. Due
>> to licensing (Sun paid for a distribution license), it was in a sense,
>> deprecated at the point of River being started.
>> 
>> For those that don't know about PSE, it used a post compilation bytecode
>> manipulator that looked for calls to a "start transaction" method, and
>> then found modification assignments to associated data structures, and
>> modified the byte code to set a "modified bit" on the associated data.
>> When "end transaction" was encountered, it stopped.
>> 
>> I think it would be a good idea to focus on the performance of the in
>> memory (messaging only type of application) version. The persistent
>> version is a completely different animal and requires some fairly
>> advanced features for managing all of the appropriate control points.
>> Making one code path do both can be somewhat challenging from an all out
>> performance perspective.

I agree, useful.  Very useful.


>> 
> 
> Thanks for the useful background information.
> 
> There is one slim hope I can see for a common code path, but it is a
> very long way off.
> 
> My prejudice, subject to being convinced that another approach would be
> better, would be to try to map a persistent version to a relational
> database through SQL. Relational databases deal with transactions, ACID,
> distribution, and performance issues. There are a lot of options for
> users, more than for OO databases, at all price points starting at free.


This would not be in the long run something we could use.  We need to do ACID transactions with in-memory, persistent, functionality.  And, we need to have a common data model that can link to SQL, etc, legacy (or new) systems.  

> 
> The way outrigger uses its FastList looks rather like a sort of
> simplified relational database, with each FastList instance representing
> a table and selects being done by linear scan of the table.
> 
> If we made a persistent version use a relational database to represent
> the space,

This would not be usable for us.  This is too slow and does not have the correct QCC features, especially scalability.

> we could then experiment with performance run-offs between
> our best shot at an ad-hoc in-memory implementation, and what we get
> from the persistent version if we drop in an in-memory database
> implementation. If they come close, we could drop the ad-hoc
> implementation and focus all effort on the relational database version.
> 
> It is a slim hope. Often, a custom tuned data structure will out-perform
> a specialization of a general data structure. In any case, I agree with
> working first on the in-memory version.

I agree on working on the in-memory version too but making it more ubiquitous.

> 
> Patricia

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: datastructure classes

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

Patricia,

Here is an update from one of the authors of the PDF you cited.

Mr. McGrady,
In a paper on lock-free memory management (http://www.research.ibm.com/people/m/michael/ieeetpds-2004.pdf) I describe a variant of the lock-free algorithm in the PODC 1996, such that the memory used by the nodes of the queue can be freed and reused arbitrarily.
My published work on concurrent algorithms since then is listed here (http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/m/Michael:Maged_M=.html). 
Examples include:
- Idempotent work stealing. PPOPP 2009: 45-54
- Practical Lock-Free and Wait-Free LL/SC/VL Implementations Using 64-Bit CAS. DISC 2004: 144-158
- Scalable lock-free dynamic memory allocation. PLDI 2004: 35-46
- Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects. IEEE Trans. Parallel Distrib. Syst. 15(6): 491-504 (2004)
- High performance dynamic lock-free hash tables and list-based sets. SPAA 2002: 73-82
Please let me know if you have any questions.
Best regards,
Maged

MG
On Dec 14, 2010, at 1:40 PM, Patricia Shanahan wrote:

> On 12/14/2010 8:37 AM, Gregg Wonderly wrote:
>> On 12/14/2010 1:36 AM, MICHAEL MCGRADY wrote:
>>> I would say that in addition to just be a fast data structure the data
>>> structure
>> > must be fast and accommodate synchronous and asynchronous backups,
>> partitions,
>> > and transactions.
>> 
>> This is an important issue from the perspective that there are two
>> scenarios that used to be supported by outrigger. A persistent and an
>> non-persistent version used to exist. The persistent version used PSE
>> for serialization to disk. That was a simple yet powerful mechanism. Due
>> to licensing (Sun paid for a distribution license), it was in a sense,
>> deprecated at the point of River being started.
>> 
>> For those that don't know about PSE, it used a post compilation bytecode
>> manipulator that looked for calls to a "start transaction" method, and
>> then found modification assignments to associated data structures, and
>> modified the byte code to set a "modified bit" on the associated data.
>> When "end transaction" was encountered, it stopped.
>> 
>> I think it would be a good idea to focus on the performance of the in
>> memory (messaging only type of application) version. The persistent
>> version is a completely different animal and requires some fairly
>> advanced features for managing all of the appropriate control points.
>> Making one code path do both can be somewhat challenging from an all out
>> performance perspective.
>> 
> 
> Thanks for the useful background information.
> 
> There is one slim hope I can see for a common code path, but it is a
> very long way off.
> 
> My prejudice, subject to being convinced that another approach would be
> better, would be to try to map a persistent version to a relational
> database through SQL. Relational databases deal with transactions, ACID,
> distribution, and performance issues. There are a lot of options for
> users, more than for OO databases, at all price points starting at free.
> 
> The way outrigger uses its FastList looks rather like a sort of
> simplified relational database, with each FastList instance representing
> a table and selects being done by linear scan of the table.
> 
> If we made a persistent version use a relational database to represent
> the space, we could then experiment with performance run-offs between
> our best shot at an ad-hoc in-memory implementation, and what we get
> from the persistent version if we drop in an in-memory database
> implementation. If they come close, we could drop the ad-hoc
> implementation and focus all effort on the relational database version.
> 
> It is a slim hope. Often, a custom tuned data structure will out-perform
> a specialization of a general data structure. In any case, I agree with
> working first on the in-memory version.
> 
> Patricia

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: datastructure classes

Posted by Gregg Wonderly <gr...@gmail.com>.

On 12/14/2010 3:40 PM, Patricia Shanahan wrote:
> On 12/14/2010 8:37 AM, Gregg Wonderly wrote:
>> On 12/14/2010 1:36 AM, MICHAEL MCGRADY wrote:
>>> I would say that in addition to just be a fast data structure the data
>>> structure
>> > must be fast and accommodate synchronous and asynchronous backups,
>> partitions,
>> > and transactions.
>>
>> This is an important issue from the perspective that there are two
>> scenarios that used to be supported by outrigger. A persistent and an
>> non-persistent version used to exist. The persistent version used PSE
>> for serialization to disk. That was a simple yet powerful mechanism. Due
>> to licensing (Sun paid for a distribution license), it was in a sense,
>> deprecated at the point of River being started.
>>
>> For those that don't know about PSE, it used a post compilation bytecode
>> manipulator that looked for calls to a "start transaction" method, and
>> then found modification assignments to associated data structures, and
>> modified the byte code to set a "modified bit" on the associated data.
>> When "end transaction" was encountered, it stopped.
>>
>> I think it would be a good idea to focus on the performance of the in
>> memory (messaging only type of application) version. The persistent
>> version is a completely different animal and requires some fairly
>> advanced features for managing all of the appropriate control points.
>> Making one code path do both can be somewhat challenging from an all out
>> performance perspective.
>>
>
> Thanks for the useful background information.
>
> There is one slim hope I can see for a common code path, but it is a
> very long way off.
>
> My prejudice, subject to being convinced that another approach would be
> better, would be to try to map a persistent version to a relational
> database through SQL. Relational databases deal with transactions, ACID,
> distribution, and performance issues. There are a lot of options for
> users, more than for OO databases, at all price points starting at free.

The point of javaspaces, is the API, not the implementation.  The details of 
implementation can be wide and varied as has been demonstrated by different 
implementations.   The linking of "leasing", "transactions" and "matching", with 
"time ordered visibility" creates a powerful mechanism.   While an SQL database 
can include these features, they are not there by default, and this causes one 
to need quite a bit of infrastructure to create a working Javaspace based on an 
SQL database.  In the end, I'd rather just write an SQL database application 
which provided these features, rather than use Javaspaces, if I "needed" SQL.

Integrating all of the features is, in my mind, a large task when you have to 
consider the variation of which SQL databases do and do not support transactions 
with row locking granularity.  A Lease on an Entry, needs to work effectively, 
regardless of who has a transactional lock on it, for example.

> The way outrigger uses its FastList looks rather like a sort of
> simplified relational database, with each FastList instance representing
> a table and selects being done by linear scan of the table.
For read/take this is the basics, but I think when you mix in transactions and 
leasing, there is more work going on behind the scenes, associated with all 
active FastList instances.
> If we made a persistent version use a relational database to represent
> the space, we could then experiment with performance run-offs between
> our best shot at an ad-hoc in-memory implementation, and what we get
> from the persistent version if we drop in an in-memory database
> implementation. If they come close, we could drop the ad-hoc
> implementation and focus all effort on the relational database version.
I am the  most worried by the fact that our common interface is JDBC but 
transactions and leasing on top of that will not always be easy to implement 
with all JDBC accessible databases.   Microsoft Access vs Oracle for example has 
numerous variances on what is possible to do with one client vs multiple.
> It is a slim hope. Often, a custom tuned data structure will out-perform
> a specialization of a general data structure. In any case, I agree with
> working first on the in-memory version.
I really appreciate your attention to the details and looking into all of this 
with a fresh set of eyes!

Gregg Wonderly

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/14/2010 8:37 AM, Gregg Wonderly wrote:
> On 12/14/2010 1:36 AM, MICHAEL MCGRADY wrote:
>> I would say that in addition to just be a fast data structure the data
>> structure
>  > must be fast and accommodate synchronous and asynchronous backups,
> partitions,
>  > and transactions.
>
> This is an important issue from the perspective that there are two
> scenarios that used to be supported by outrigger. A persistent and an
> non-persistent version used to exist. The persistent version used PSE
> for serialization to disk. That was a simple yet powerful mechanism. Due
> to licensing (Sun paid for a distribution license), it was in a sense,
> deprecated at the point of River being started.
>
> For those that don't know about PSE, it used a post compilation bytecode
> manipulator that looked for calls to a "start transaction" method, and
> then found modification assignments to associated data structures, and
> modified the byte code to set a "modified bit" on the associated data.
> When "end transaction" was encountered, it stopped.
>
> I think it would be a good idea to focus on the performance of the in
> memory (messaging only type of application) version. The persistent
> version is a completely different animal and requires some fairly
> advanced features for managing all of the appropriate control points.
> Making one code path do both can be somewhat challenging from an all out
> performance perspective.
>

Thanks for the useful background information.

There is one slim hope I can see for a common code path, but it is a
very long way off.

My prejudice, subject to being convinced that another approach would be
better, would be to try to map a persistent version to a relational
database through SQL. Relational databases deal with transactions, ACID,
distribution, and performance issues. There are a lot of options for
users, more than for OO databases, at all price points starting at free.

The way outrigger uses its FastList looks rather like a sort of
simplified relational database, with each FastList instance representing
a table and selects being done by linear scan of the table.

If we made a persistent version use a relational database to represent
the space, we could then experiment with performance run-offs between
our best shot at an ad-hoc in-memory implementation, and what we get
from the persistent version if we drop in an in-memory database
implementation. If they come close, we could drop the ad-hoc
implementation and focus all effort on the relational database version.

It is a slim hope. Often, a custom tuned data structure will out-perform
a specialization of a general data structure. In any case, I agree with
working first on the in-memory version.

Patricia

Re: datastructure classes

Posted by Gregg Wonderly <gr...@wonderly.org>.

On 12/14/2010 1:36 AM, MICHAEL MCGRADY wrote:
> I would say that in addition to just be a fast data structure the data structure
 > must be fast and accommodate synchronous and asynchronous backups, partitions,
 > and transactions.

This is an important issue from the perspective that there are two scenarios 
that used to be supported by outrigger.  A persistent and an non-persistent 
version used to exist.  The persistent version used PSE for serialization to 
disk.  That was a simple yet powerful mechanism.  Due to licensing (Sun paid for 
a distribution license), it was in a sense, deprecated at the point of River 
being started.

For those that don't know about PSE, it used a post compilation bytecode 
manipulator that looked for calls to a "start transaction" method, and then 
found modification assignments to associated data structures, and modified the 
byte code to set a "modified bit" on the associated data.  When "end 
transaction" was encountered, it stopped.

I think it would be a good idea to focus on the performance of the in memory 
(messaging only type of application) version.  The persistent version is a 
completely different animal and requires some fairly advanced features for 
managing all of the appropriate control points.  Making one code path do both 
can be somewhat challenging from an all out performance perspective.

Gregg

> On Dec 13, 2010, at 5:15 PM, Patricia Shanahan wrote:
>
>> You make a good point that I should take a look around to see if I can find anything suitable that already exists, with appropriate licensing.
>>
>> That said, I believe the requirements for an outrigger FastList are:
>>
>> 1. Highly scalable performance. In particular, iterating over the list should be fast even when many threads are doing it. As far as I can tell, any JavaSpace lookup turns into iterating over a FastList representing items of acceptable classes, to see if any of the nodes match the other template requirements.
>>
>> 2. A limited set of features: Iterate over the list starting at the head, remove a node, add at tail.
>>
>> The two libraries each go in a different direction.
>>
>> Javolution is primarily a single thread speed play, with fast and predictable performance. It is what I would want if I were writing real time code. It does have some support for concurrency, but not what is needed for outrigger: "Fast collections (or maps) can be made thread-safe by marking them FastCollection#shared shared Having a shared collection (or map) means that it can be safely used without synchronization. It does not mean that a change made by a thread is automatically viewed by another thread (that would require synchronizing)."
>>
>> The Apache Commons collections are very feature rich, but seem to be primarily single thread, with the synchronizing decorator approach to concurrency.
>>
>> I will do some more web searching, but my current strongest contender for an existing class is java.util.concurrent.ConcurrentLinkedQueue. It's main limitation is slow remove, but that might be worked around by using a pair of queues, and periodically cleaning by copying from one queue to another. Also, java.util.concurrent contains a reference to an important paper, http://www.cs.rochester.edu/u/michael/PODC96.html, that gives me a starting point for finding the latest research on these issues.
>>
>> Patricia
>>
>>
>>
>>
>> On 12/13/2010 4:30 PM, James Grahn wrote:
>>> Because there've been a few concerns about memory models and concurrent
>>> data structures, I thought I'd suggest a couple of resources. Library
>>> usage is preferable to reinventing the wheel, where possible:
>>>
>>> First:
>>> http://javolution.org/
>>>
>>> I haven't used it directly; I've only used a library which used it. But
>>> it perhaps deserves a scan to see if it would meet our requirements (and
>>> testing to ensure it's acceptable if it claims to).
>>>
>>> Also, it's BSD-licensed. I believe that's acceptable?
>>>
>>> Second:
>>> http://commons.apache.org/collections/index.html
>>>
>>> I don't *think* it has what we need, but it's worth poking around in
>>> before creating a new datastructure. Has the advantage of being part of
>>> the Apache family.
>>>
>>> jamesG
>>>
>>> On 12/13/2010 5:24 PM, Peter Firmstone wrote:
>>>> Gregg Wonderly wrote:
>>>>> On 12/13/2010 12:34 PM, Gregg Wonderly wrote:
>>>>>> This does fail fairly quickly (immediately) on my windows laptop.
>>>>>>
>>>>>> I am not sure that I have time to really look over this code. I
>>>>>> wonder if anyone
>>>>>> knows if this is relatively new code that John put together as part
>>>>>> of the
>>>>>> effort to remove the use of PSE from outrigger, or is the original
>>>>>> "non-persistent" javaspaces implementation?
>>>>>>
>>>>>> Perhaps we need to do something different here, a segmented list for
>>>>>> example,
>>>>>> which is what PSE did with it's Vector implementation so that
>>>>>> segments of the
>>>>>> list could be locked independently, as well as allowing the segments
>>>>>> to be
>>>>>> "deleted" from disk once they were "empty".
>>>>>
>>>>> And, of course if we pull out outrigger, as an application/service,
>>>>> separate from the Jini part of river, we could just say that Outrigger
>>>>> requires JDK1.5 so that we can move to a different concurrency
>>>>> implementation if that is needed, using the new memory model.
>>>>
>>>> It seems like a lot of work to fix a package private implementation,
>>>> already based on flawed assumptions (Patricia's done a great job
>>>> debugging this, she's a real asset, I look forward to learning more
>>>> debugging tips). I suspect we'll get a much better result starting from
>>>> scratch, utilising Java 6.
>>>>
>>>> Since River is a Jini platform, why don't we start by creating an
>>>> independent implementation of outrigger utilising any latest available
>>>> java features. Not only will this produce a better implementation, that
>>>> will be easier to support, but it might improve our understanding of
>>>> what's required for a modular build as well.
>>>>
>>>> The existing outrigger implementation can remain as it is, but
>>>> deprecated, left in place for legacy support.
>>>>
>>>> I've got some concurrent utilities in pepe you may pinch / improve if
>>>> you like:
>>>>
>>>> org.apache.river.impl.util.*
>>>>
>>>> ConcurrentCollections
>>>> ConcurrentSoftMap
>>>> ConcurrentWeakIdentityMap
>>>> ConcurrentWeakMap
>>>>
>>>> ConcurrentCollections is a multi read single write lock based collection
>>>> wrapper. It defensively copies for Iterators but still allows performing
>>>> removals from the underlying collection.
>>>>
>>>> The ConcurrentMap's are based on ConcurrentHashMap, using a
>>>> ReferenceQueue to remove stale entries prior to every method call.
>>>> Cheers,
>>>>
>>>> Peter.
>>>>
>>>>>
>>>>> One of the things I did in griddle, was define a RemoteIterator which
>>>>> allows a "get" operation to have a return value before anything is in
>>>>> the space. A server side thread, then looks through the space for
>>>>> matches and adds them to the return queue for the calling "get". This
>>>>> allows server side contention to be managed to some degree because the
>>>>> number of "searching" threads could be held to an appropriate minimum
>>>>> (even one). The javaspaces API doesn't disallow such proxy
>>>>> implementation and JavaSpaces05 iterator support starts to expose this
>>>>> kind of thing more literally.
>>>>>
>>>>> Ultimately, I think a segmented list that looks like a
>>>>> List<List<Entry>>  would be a way to distribute locking for "get"
>>>>> because iteration would be less likely to be on the same segment at
>>>>> the same time.
>>>>>
>>>>> Insertion at the tail, is always a contentious issue for concurrent
>>>>> lists. Sometimes you can just use a small ConcurrentHashMap to perform
>>>>> adds until it gets to a particular size, and then turn its contents
>>>>> into a List and add that list to the tail of the List<List<Entry>>.
>>>>> You can choose then to decide when to do that movement by watching for
>>>>> the other lists to be empty too, or when a traversal gets to the end
>>>>> of what is in a list already.
>>>>>
>>>>> This keeps writers quite free to insert quickly and completely away
>>>>> from the readers as well. Ordering presents most of the opportunities
>>>>> for big time contention. Unordered (map), or more segmented (map of
>>>>> list) construction relieves some of the contention if course, by
>>>>> distributing it.
>>>>>
>>>>> Gregg
>>>>>
>>>>>> Gregg
>>>>>>
>>>>>> On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
>>>>>>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>>>>>>> Patricia Shanahan wrote:
>>>>>>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>>>>>>> ...
>>>>>>>>>> The important issue in FastList is that it was written with the
>>>>>>>>> JDK1.4
>>>>>>>>>> memory model. After moving River to Java 1.5, we'd have the
>>>>>>>>> JSR166 work
>>>>>>>>>> and the new, consistent memory model where volatile has a true
>>>>>>>>> meaning.
>>>>>>>>>> However, this code in particular is quite complex as you have
>>>>>>>>> noted, so
>>>>>>>>>> even adjusting to the new memory model could be problematic.
>>>>>>>>>
>>>>>>>>> I've just run a modified, simplified version of my test with java
>>>>>>>>> version "1.4.2_19" and an unmodified copy of FastList, and I still
>>>>>>>>> get
>>>>>>>>> the NullPointerException. This changes my thinking a bit. I had been
>>>>>>>>> working from the assumption that the issue was to do with the
>>>>>>>>> changes
>>>>>>>>> in memory model between 1.4 and 1.5. I now have to consider the
>>>>>>>>> possibility of a more basic bug that is independent of the memory
>>>>>>>>> model.
>>>>>>>>>
>>>>>>>>> If there is anyone with a FastList or Java memory model background
>>>>>>>>> who
>>>>>>>>> would like to help, please reply. I would welcome another set of
>>>>>>>>> eyes
>>>>>>>>> on the code, and a cross check on my conclusions so far about how
>>>>>>>>> FastList is supposed to work. There seems to be a critical invariant
>>>>>>>>> that gets broken, and once that happens we are on track to either a
>>>>>>>>> NullPointerException or dropped items.
>>>>>>>>>
>>>>>>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main
>>>>>>>>> program (JDK 1.4 or later0. In both forms, all it does is fire up a
>>>>>>>>> mixture of threads that repeatedly add items to a FastList and
>>>>>>>>> threads
>>>>>>>>> that repeatedly remove the first item they can from the FastList.
>>>>>>>>> Failures seem to require simultaneous adds and removes.
>>>>>>>>>
>>>>>>>>> If I don't nail this problem fairly soon, I may abandon the current,
>>>>>>>>> rather complicated, code and switch to writing a concurrent high
>>>>>>>>> performance FastList substitute for 1.5 or later.
>>>>>>>>>
>>>>>>>>> Patricia
>>>>>>>>>
>>>>>>>> I'll have a look tonight, no promises though ;)
>>>>>>>
>>>>>>> I'm attaching the simplified test application main program that can
>>>>>>> run, and
>>>>>>> fail, under JRE 1.4, with no need for Junit.
>>>>>>>
>>>>>>> Here's what I think I know. First of all, I have found some dubious
>>>>>>> synchronization situations. However, fixing all the things I have so
>>>>>>> far found
>>>>>>> of that type has only reduced the failure rate, not eliminated
>>>>>>> failures. That
>>>>>>> could be caused by changing timings without having any impact on the
>>>>>>> root cause.
>>>>>>>
>>>>>>> The key invariant relates to a thread that is doing a scan, starting
>>>>>>> with a call
>>>>>>> to head() and proceeding through a series of calls to next() to
>>>>>>> examine nodes.
>>>>>>> The head() call sets up a guard node for the thread that was the
>>>>>>> tail at some
>>>>>>> point during the head call. The invariant is that the series of
>>>>>>> next() calls
>>>>>>> will reach the guard node before finding a null next pointer,
>>>>>>> indicating the
>>>>>>> actual tail.
>>>>>>>
>>>>>>> The remove call does not really remove anything, it merely marks the
>>>>>>> node
>>>>>>> removed. Removed nodes are unlinked as a side effect of scans,
>>>>>>> during head and
>>>>>>> next calls, but only if they are not guard nodes.
>>>>>>>
>>>>>>> There are additional complications in the restart and reap methods,
>>>>>>> but we can
>>>>>>> ignore them for now - my test does not use them.
>>>>>>>
>>>>>>> Once a guard node is lost, the synchronization breaks down
>>>>>>> completely, because
>>>>>>> e.g. insertion at tail is protected by synchronization on the
>>>>>>> FastList instance,
>>>>>>> but unlinking of a removed node in the middle is protected, to the
>>>>>>> extent it is
>>>>>>> protected at all, by synchronization on the FastList.Node instance
>>>>>>> that is being
>>>>>>> removed.
>>>>>>>
>>>>>>> The commonest failure symptom is a scan reaching the null next
>>>>>>> pointer at the
>>>>>>> end of the FIFO during head(), without first finding the guard node
>>>>>>> it just set
>>>>>>> up. An alternative form of failure is loss of some entries - they
>>>>>>> get added, but
>>>>>>> the remove threads never find them. The second symptom is
>>>>>>> predominant in the
>>>>>>> JavaSpaces stress test that got me started on this. Messing up a
>>>>>>> next pointer
>>>>>>> could cause either.
>>>>>>>
>>>>>>> Incidentally, I'm curious about why it has such a fragile system in
>>>>>>> which the
>>>>>>> state of a scan is partly tracked by thread, when it seems like an
>>>>>>> obvious
>>>>>>> candidate for the Iterator pattern. Callers do need to be able to
>>>>>>> find out if a
>>>>>>> remove call succeeded or not (the node may have been removed by
>>>>>>> another thread),
>>>>>>> but that could be done in an interface extending Iterator. The
>>>>>>> WeakHashMap in a
>>>>>>> node that keeps track of the threads for which it is a guard would
>>>>>>> instead track
>>>>>>> the Iterator. There would be no need for thread local storage, the
>>>>>>> same data
>>>>>>> could be kept in the Iterator.
>>>>>>>
>>>>>>> Thanks for any time you can spend looking at this.
>>>>>>>
>>>>>>> Patricia
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
> Michael McGrady
> Chief Architect
> Topia Technology, Inc.
> Cel 1.253.720.3365
> Work 1.253.572.9712 extension 2037
> mmcgrady@topiatechnology.com
>
>
>
>
>
>

Re: datastructure classes

Posted by Mike McGrady <mm...@topiatechnology.com>.

For my part the "for the ages" implementation of this function is required.  I am thinking about Brewer's Theorem and other related "omitted" or non-functional values or quality control criteria (QCC) in related to this functionality.  You have chosen your target well, I believe.

Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 14, 2010, at 12:18 AM, Patricia Shanahan <pa...@acm.org> wrote:

> On 12/13/2010 11:36 PM, MICHAEL MCGRADY wrote:
>> I would say that in addition to just be a fast data structure the
>> data structure must be fast and accommodate synchronous and
>> asynchronous backups, partitions, and transactions.
> 
> How much of that should be in the bottom layer? As far as I can tell,
> the general design of outrigger is to implement features such as
> transactions on top of FastList, treating it as a fast, simple
> primitive. Remember I'm not trying to create the list data structure for
> the ages, just something that I can make work and that does the same job
> as FastList.
> 
> Patricia

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/13/2010 11:36 PM, MICHAEL MCGRADY wrote:
> I would say that in addition to just be a fast data structure the
> data structure must be fast and accommodate synchronous and
> asynchronous backups, partitions, and transactions.

How much of that should be in the bottom layer? As far as I can tell,
the general design of outrigger is to implement features such as
transactions on top of FastList, treating it as a fast, simple
primitive. Remember I'm not trying to create the list data structure for
the ages, just something that I can make work and that does the same job
as FastList.

Patricia

Re: datastructure classes

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

I would say that in addition to just be a fast data structure the data structure must be fast and accommodate synchronous and asynchronous backups, partitions, and transactions.

On Dec 13, 2010, at 5:15 PM, Patricia Shanahan wrote:

> You make a good point that I should take a look around to see if I can find anything suitable that already exists, with appropriate licensing.
> 
> That said, I believe the requirements for an outrigger FastList are:
> 
> 1. Highly scalable performance. In particular, iterating over the list should be fast even when many threads are doing it. As far as I can tell, any JavaSpace lookup turns into iterating over a FastList representing items of acceptable classes, to see if any of the nodes match the other template requirements.
> 
> 2. A limited set of features: Iterate over the list starting at the head, remove a node, add at tail.
> 
> The two libraries each go in a different direction.
> 
> Javolution is primarily a single thread speed play, with fast and predictable performance. It is what I would want if I were writing real time code. It does have some support for concurrency, but not what is needed for outrigger: "Fast collections (or maps) can be made thread-safe by marking them FastCollection#shared shared Having a shared collection (or map) means that it can be safely used without synchronization. It does not mean that a change made by a thread is automatically viewed by another thread (that would require synchronizing)."
> 
> The Apache Commons collections are very feature rich, but seem to be primarily single thread, with the synchronizing decorator approach to concurrency.
> 
> I will do some more web searching, but my current strongest contender for an existing class is java.util.concurrent.ConcurrentLinkedQueue. It's main limitation is slow remove, but that might be worked around by using a pair of queues, and periodically cleaning by copying from one queue to another. Also, java.util.concurrent contains a reference to an important paper, http://www.cs.rochester.edu/u/michael/PODC96.html, that gives me a starting point for finding the latest research on these issues.
> 
> Patricia
> 
> 
> 
> 
> On 12/13/2010 4:30 PM, James Grahn wrote:
>> Because there've been a few concerns about memory models and concurrent
>> data structures, I thought I'd suggest a couple of resources. Library
>> usage is preferable to reinventing the wheel, where possible:
>> 
>> First:
>> http://javolution.org/
>> 
>> I haven't used it directly; I've only used a library which used it. But
>> it perhaps deserves a scan to see if it would meet our requirements (and
>> testing to ensure it's acceptable if it claims to).
>> 
>> Also, it's BSD-licensed. I believe that's acceptable?
>> 
>> Second:
>> http://commons.apache.org/collections/index.html
>> 
>> I don't *think* it has what we need, but it's worth poking around in
>> before creating a new datastructure. Has the advantage of being part of
>> the Apache family.
>> 
>> jamesG
>> 
>> On 12/13/2010 5:24 PM, Peter Firmstone wrote:
>>> Gregg Wonderly wrote:
>>>> On 12/13/2010 12:34 PM, Gregg Wonderly wrote:
>>>>> This does fail fairly quickly (immediately) on my windows laptop.
>>>>> 
>>>>> I am not sure that I have time to really look over this code. I
>>>>> wonder if anyone
>>>>> knows if this is relatively new code that John put together as part
>>>>> of the
>>>>> effort to remove the use of PSE from outrigger, or is the original
>>>>> "non-persistent" javaspaces implementation?
>>>>> 
>>>>> Perhaps we need to do something different here, a segmented list for
>>>>> example,
>>>>> which is what PSE did with it's Vector implementation so that
>>>>> segments of the
>>>>> list could be locked independently, as well as allowing the segments
>>>>> to be
>>>>> "deleted" from disk once they were "empty".
>>>> 
>>>> And, of course if we pull out outrigger, as an application/service,
>>>> separate from the Jini part of river, we could just say that Outrigger
>>>> requires JDK1.5 so that we can move to a different concurrency
>>>> implementation if that is needed, using the new memory model.
>>> 
>>> It seems like a lot of work to fix a package private implementation,
>>> already based on flawed assumptions (Patricia's done a great job
>>> debugging this, she's a real asset, I look forward to learning more
>>> debugging tips). I suspect we'll get a much better result starting from
>>> scratch, utilising Java 6.
>>> 
>>> Since River is a Jini platform, why don't we start by creating an
>>> independent implementation of outrigger utilising any latest available
>>> java features. Not only will this produce a better implementation, that
>>> will be easier to support, but it might improve our understanding of
>>> what's required for a modular build as well.
>>> 
>>> The existing outrigger implementation can remain as it is, but
>>> deprecated, left in place for legacy support.
>>> 
>>> I've got some concurrent utilities in pepe you may pinch / improve if
>>> you like:
>>> 
>>> org.apache.river.impl.util.*
>>> 
>>> ConcurrentCollections
>>> ConcurrentSoftMap
>>> ConcurrentWeakIdentityMap
>>> ConcurrentWeakMap
>>> 
>>> ConcurrentCollections is a multi read single write lock based collection
>>> wrapper. It defensively copies for Iterators but still allows performing
>>> removals from the underlying collection.
>>> 
>>> The ConcurrentMap's are based on ConcurrentHashMap, using a
>>> ReferenceQueue to remove stale entries prior to every method call.
>>> Cheers,
>>> 
>>> Peter.
>>> 
>>>> 
>>>> One of the things I did in griddle, was define a RemoteIterator which
>>>> allows a "get" operation to have a return value before anything is in
>>>> the space. A server side thread, then looks through the space for
>>>> matches and adds them to the return queue for the calling "get". This
>>>> allows server side contention to be managed to some degree because the
>>>> number of "searching" threads could be held to an appropriate minimum
>>>> (even one). The javaspaces API doesn't disallow such proxy
>>>> implementation and JavaSpaces05 iterator support starts to expose this
>>>> kind of thing more literally.
>>>> 
>>>> Ultimately, I think a segmented list that looks like a
>>>> List<List<Entry>> would be a way to distribute locking for "get"
>>>> because iteration would be less likely to be on the same segment at
>>>> the same time.
>>>> 
>>>> Insertion at the tail, is always a contentious issue for concurrent
>>>> lists. Sometimes you can just use a small ConcurrentHashMap to perform
>>>> adds until it gets to a particular size, and then turn its contents
>>>> into a List and add that list to the tail of the List<List<Entry>>.
>>>> You can choose then to decide when to do that movement by watching for
>>>> the other lists to be empty too, or when a traversal gets to the end
>>>> of what is in a list already.
>>>> 
>>>> This keeps writers quite free to insert quickly and completely away
>>>> from the readers as well. Ordering presents most of the opportunities
>>>> for big time contention. Unordered (map), or more segmented (map of
>>>> list) construction relieves some of the contention if course, by
>>>> distributing it.
>>>> 
>>>> Gregg
>>>> 
>>>>> Gregg
>>>>> 
>>>>> On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
>>>>>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>>>>>> Patricia Shanahan wrote:
>>>>>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>>>>>> ...
>>>>>>>> > The important issue in FastList is that it was written with the
>>>>>>>> JDK1.4
>>>>>>>> > memory model. After moving River to Java 1.5, we'd have the
>>>>>>>> JSR166 work
>>>>>>>> > and the new, consistent memory model where volatile has a true
>>>>>>>> meaning.
>>>>>>>> > However, this code in particular is quite complex as you have
>>>>>>>> noted, so
>>>>>>>> > even adjusting to the new memory model could be problematic.
>>>>>>>> 
>>>>>>>> I've just run a modified, simplified version of my test with java
>>>>>>>> version "1.4.2_19" and an unmodified copy of FastList, and I still
>>>>>>>> get
>>>>>>>> the NullPointerException. This changes my thinking a bit. I had been
>>>>>>>> working from the assumption that the issue was to do with the
>>>>>>>> changes
>>>>>>>> in memory model between 1.4 and 1.5. I now have to consider the
>>>>>>>> possibility of a more basic bug that is independent of the memory
>>>>>>>> model.
>>>>>>>> 
>>>>>>>> If there is anyone with a FastList or Java memory model background
>>>>>>>> who
>>>>>>>> would like to help, please reply. I would welcome another set of
>>>>>>>> eyes
>>>>>>>> on the code, and a cross check on my conclusions so far about how
>>>>>>>> FastList is supposed to work. There seems to be a critical invariant
>>>>>>>> that gets broken, and once that happens we are on track to either a
>>>>>>>> NullPointerException or dropped items.
>>>>>>>> 
>>>>>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main
>>>>>>>> program (JDK 1.4 or later0. In both forms, all it does is fire up a
>>>>>>>> mixture of threads that repeatedly add items to a FastList and
>>>>>>>> threads
>>>>>>>> that repeatedly remove the first item they can from the FastList.
>>>>>>>> Failures seem to require simultaneous adds and removes.
>>>>>>>> 
>>>>>>>> If I don't nail this problem fairly soon, I may abandon the current,
>>>>>>>> rather complicated, code and switch to writing a concurrent high
>>>>>>>> performance FastList substitute for 1.5 or later.
>>>>>>>> 
>>>>>>>> Patricia
>>>>>>>> 
>>>>>>> I'll have a look tonight, no promises though ;)
>>>>>> 
>>>>>> I'm attaching the simplified test application main program that can
>>>>>> run, and
>>>>>> fail, under JRE 1.4, with no need for Junit.
>>>>>> 
>>>>>> Here's what I think I know. First of all, I have found some dubious
>>>>>> synchronization situations. However, fixing all the things I have so
>>>>>> far found
>>>>>> of that type has only reduced the failure rate, not eliminated
>>>>>> failures. That
>>>>>> could be caused by changing timings without having any impact on the
>>>>>> root cause.
>>>>>> 
>>>>>> The key invariant relates to a thread that is doing a scan, starting
>>>>>> with a call
>>>>>> to head() and proceeding through a series of calls to next() to
>>>>>> examine nodes.
>>>>>> The head() call sets up a guard node for the thread that was the
>>>>>> tail at some
>>>>>> point during the head call. The invariant is that the series of
>>>>>> next() calls
>>>>>> will reach the guard node before finding a null next pointer,
>>>>>> indicating the
>>>>>> actual tail.
>>>>>> 
>>>>>> The remove call does not really remove anything, it merely marks the
>>>>>> node
>>>>>> removed. Removed nodes are unlinked as a side effect of scans,
>>>>>> during head and
>>>>>> next calls, but only if they are not guard nodes.
>>>>>> 
>>>>>> There are additional complications in the restart and reap methods,
>>>>>> but we can
>>>>>> ignore them for now - my test does not use them.
>>>>>> 
>>>>>> Once a guard node is lost, the synchronization breaks down
>>>>>> completely, because
>>>>>> e.g. insertion at tail is protected by synchronization on the
>>>>>> FastList instance,
>>>>>> but unlinking of a removed node in the middle is protected, to the
>>>>>> extent it is
>>>>>> protected at all, by synchronization on the FastList.Node instance
>>>>>> that is being
>>>>>> removed.
>>>>>> 
>>>>>> The commonest failure symptom is a scan reaching the null next
>>>>>> pointer at the
>>>>>> end of the FIFO during head(), without first finding the guard node
>>>>>> it just set
>>>>>> up. An alternative form of failure is loss of some entries - they
>>>>>> get added, but
>>>>>> the remove threads never find them. The second symptom is
>>>>>> predominant in the
>>>>>> JavaSpaces stress test that got me started on this. Messing up a
>>>>>> next pointer
>>>>>> could cause either.
>>>>>> 
>>>>>> Incidentally, I'm curious about why it has such a fragile system in
>>>>>> which the
>>>>>> state of a scan is partly tracked by thread, when it seems like an
>>>>>> obvious
>>>>>> candidate for the Iterator pattern. Callers do need to be able to
>>>>>> find out if a
>>>>>> remove call succeeded or not (the node may have been removed by
>>>>>> another thread),
>>>>>> but that could be done in an interface extending Iterator. The
>>>>>> WeakHashMap in a
>>>>>> node that keeps track of the threads for which it is a guard would
>>>>>> instead track
>>>>>> the Iterator. There would be no need for thread local storage, the
>>>>>> same data
>>>>>> could be kept in the Iterator.
>>>>>> 
>>>>>> Thanks for any time you can spend looking at this.
>>>>>> 
>>>>>> Patricia
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
> 

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: datastructure classes

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

Nice!  If you would like me to look at some sources, please do.

MG

On Dec 13, 2010, at 5:15 PM, Patricia Shanahan wrote:

> You make a good point that I should take a look around to see if I can find anything suitable that already exists, with appropriate licensing.
> 
> That said, I believe the requirements for an outrigger FastList are:
> 
> 1. Highly scalable performance. In particular, iterating over the list should be fast even when many threads are doing it. As far as I can tell, any JavaSpace lookup turns into iterating over a FastList representing items of acceptable classes, to see if any of the nodes match the other template requirements.
> 
> 2. A limited set of features: Iterate over the list starting at the head, remove a node, add at tail.
> 
> The two libraries each go in a different direction.
> 
> Javolution is primarily a single thread speed play, with fast and predictable performance. It is what I would want if I were writing real time code. It does have some support for concurrency, but not what is needed for outrigger: "Fast collections (or maps) can be made thread-safe by marking them FastCollection#shared shared Having a shared collection (or map) means that it can be safely used without synchronization. It does not mean that a change made by a thread is automatically viewed by another thread (that would require synchronizing)."
> 
> The Apache Commons collections are very feature rich, but seem to be primarily single thread, with the synchronizing decorator approach to concurrency.
> 
> I will do some more web searching, but my current strongest contender for an existing class is java.util.concurrent.ConcurrentLinkedQueue. It's main limitation is slow remove, but that might be worked around by using a pair of queues, and periodically cleaning by copying from one queue to another. Also, java.util.concurrent contains a reference to an important paper, http://www.cs.rochester.edu/u/michael/PODC96.html, that gives me a starting point for finding the latest research on these issues.
> 
> Patricia
> 
> 
> 
> 
> On 12/13/2010 4:30 PM, James Grahn wrote:
>> Because there've been a few concerns about memory models and concurrent
>> data structures, I thought I'd suggest a couple of resources. Library
>> usage is preferable to reinventing the wheel, where possible:
>> 
>> First:
>> http://javolution.org/
>> 
>> I haven't used it directly; I've only used a library which used it. But
>> it perhaps deserves a scan to see if it would meet our requirements (and
>> testing to ensure it's acceptable if it claims to).
>> 
>> Also, it's BSD-licensed. I believe that's acceptable?
>> 
>> Second:
>> http://commons.apache.org/collections/index.html
>> 
>> I don't *think* it has what we need, but it's worth poking around in
>> before creating a new datastructure. Has the advantage of being part of
>> the Apache family.
>> 
>> jamesG
>> 
>> On 12/13/2010 5:24 PM, Peter Firmstone wrote:
>>> Gregg Wonderly wrote:
>>>> On 12/13/2010 12:34 PM, Gregg Wonderly wrote:
>>>>> This does fail fairly quickly (immediately) on my windows laptop.
>>>>> 
>>>>> I am not sure that I have time to really look over this code. I
>>>>> wonder if anyone
>>>>> knows if this is relatively new code that John put together as part
>>>>> of the
>>>>> effort to remove the use of PSE from outrigger, or is the original
>>>>> "non-persistent" javaspaces implementation?
>>>>> 
>>>>> Perhaps we need to do something different here, a segmented list for
>>>>> example,
>>>>> which is what PSE did with it's Vector implementation so that
>>>>> segments of the
>>>>> list could be locked independently, as well as allowing the segments
>>>>> to be
>>>>> "deleted" from disk once they were "empty".
>>>> 
>>>> And, of course if we pull out outrigger, as an application/service,
>>>> separate from the Jini part of river, we could just say that Outrigger
>>>> requires JDK1.5 so that we can move to a different concurrency
>>>> implementation if that is needed, using the new memory model.
>>> 
>>> It seems like a lot of work to fix a package private implementation,
>>> already based on flawed assumptions (Patricia's done a great job
>>> debugging this, she's a real asset, I look forward to learning more
>>> debugging tips). I suspect we'll get a much better result starting from
>>> scratch, utilising Java 6.
>>> 
>>> Since River is a Jini platform, why don't we start by creating an
>>> independent implementation of outrigger utilising any latest available
>>> java features. Not only will this produce a better implementation, that
>>> will be easier to support, but it might improve our understanding of
>>> what's required for a modular build as well.
>>> 
>>> The existing outrigger implementation can remain as it is, but
>>> deprecated, left in place for legacy support.
>>> 
>>> I've got some concurrent utilities in pepe you may pinch / improve if
>>> you like:
>>> 
>>> org.apache.river.impl.util.*
>>> 
>>> ConcurrentCollections
>>> ConcurrentSoftMap
>>> ConcurrentWeakIdentityMap
>>> ConcurrentWeakMap
>>> 
>>> ConcurrentCollections is a multi read single write lock based collection
>>> wrapper. It defensively copies for Iterators but still allows performing
>>> removals from the underlying collection.
>>> 
>>> The ConcurrentMap's are based on ConcurrentHashMap, using a
>>> ReferenceQueue to remove stale entries prior to every method call.
>>> Cheers,
>>> 
>>> Peter.
>>> 
>>>> 
>>>> One of the things I did in griddle, was define a RemoteIterator which
>>>> allows a "get" operation to have a return value before anything is in
>>>> the space. A server side thread, then looks through the space for
>>>> matches and adds them to the return queue for the calling "get". This
>>>> allows server side contention to be managed to some degree because the
>>>> number of "searching" threads could be held to an appropriate minimum
>>>> (even one). The javaspaces API doesn't disallow such proxy
>>>> implementation and JavaSpaces05 iterator support starts to expose this
>>>> kind of thing more literally.
>>>> 
>>>> Ultimately, I think a segmented list that looks like a
>>>> List<List<Entry>> would be a way to distribute locking for "get"
>>>> because iteration would be less likely to be on the same segment at
>>>> the same time.
>>>> 
>>>> Insertion at the tail, is always a contentious issue for concurrent
>>>> lists. Sometimes you can just use a small ConcurrentHashMap to perform
>>>> adds until it gets to a particular size, and then turn its contents
>>>> into a List and add that list to the tail of the List<List<Entry>>.
>>>> You can choose then to decide when to do that movement by watching for
>>>> the other lists to be empty too, or when a traversal gets to the end
>>>> of what is in a list already.
>>>> 
>>>> This keeps writers quite free to insert quickly and completely away
>>>> from the readers as well. Ordering presents most of the opportunities
>>>> for big time contention. Unordered (map), or more segmented (map of
>>>> list) construction relieves some of the contention if course, by
>>>> distributing it.
>>>> 
>>>> Gregg
>>>> 
>>>>> Gregg
>>>>> 
>>>>> On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
>>>>>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>>>>>> Patricia Shanahan wrote:
>>>>>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>>>>>> ...
>>>>>>>> > The important issue in FastList is that it was written with the
>>>>>>>> JDK1.4
>>>>>>>> > memory model. After moving River to Java 1.5, we'd have the
>>>>>>>> JSR166 work
>>>>>>>> > and the new, consistent memory model where volatile has a true
>>>>>>>> meaning.
>>>>>>>> > However, this code in particular is quite complex as you have
>>>>>>>> noted, so
>>>>>>>> > even adjusting to the new memory model could be problematic.
>>>>>>>> 
>>>>>>>> I've just run a modified, simplified version of my test with java
>>>>>>>> version "1.4.2_19" and an unmodified copy of FastList, and I still
>>>>>>>> get
>>>>>>>> the NullPointerException. This changes my thinking a bit. I had been
>>>>>>>> working from the assumption that the issue was to do with the
>>>>>>>> changes
>>>>>>>> in memory model between 1.4 and 1.5. I now have to consider the
>>>>>>>> possibility of a more basic bug that is independent of the memory
>>>>>>>> model.
>>>>>>>> 
>>>>>>>> If there is anyone with a FastList or Java memory model background
>>>>>>>> who
>>>>>>>> would like to help, please reply. I would welcome another set of
>>>>>>>> eyes
>>>>>>>> on the code, and a cross check on my conclusions so far about how
>>>>>>>> FastList is supposed to work. There seems to be a critical invariant
>>>>>>>> that gets broken, and once that happens we are on track to either a
>>>>>>>> NullPointerException or dropped items.
>>>>>>>> 
>>>>>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main
>>>>>>>> program (JDK 1.4 or later0. In both forms, all it does is fire up a
>>>>>>>> mixture of threads that repeatedly add items to a FastList and
>>>>>>>> threads
>>>>>>>> that repeatedly remove the first item they can from the FastList.
>>>>>>>> Failures seem to require simultaneous adds and removes.
>>>>>>>> 
>>>>>>>> If I don't nail this problem fairly soon, I may abandon the current,
>>>>>>>> rather complicated, code and switch to writing a concurrent high
>>>>>>>> performance FastList substitute for 1.5 or later.
>>>>>>>> 
>>>>>>>> Patricia
>>>>>>>> 
>>>>>>> I'll have a look tonight, no promises though ;)
>>>>>> 
>>>>>> I'm attaching the simplified test application main program that can
>>>>>> run, and
>>>>>> fail, under JRE 1.4, with no need for Junit.
>>>>>> 
>>>>>> Here's what I think I know. First of all, I have found some dubious
>>>>>> synchronization situations. However, fixing all the things I have so
>>>>>> far found
>>>>>> of that type has only reduced the failure rate, not eliminated
>>>>>> failures. That
>>>>>> could be caused by changing timings without having any impact on the
>>>>>> root cause.
>>>>>> 
>>>>>> The key invariant relates to a thread that is doing a scan, starting
>>>>>> with a call
>>>>>> to head() and proceeding through a series of calls to next() to
>>>>>> examine nodes.
>>>>>> The head() call sets up a guard node for the thread that was the
>>>>>> tail at some
>>>>>> point during the head call. The invariant is that the series of
>>>>>> next() calls
>>>>>> will reach the guard node before finding a null next pointer,
>>>>>> indicating the
>>>>>> actual tail.
>>>>>> 
>>>>>> The remove call does not really remove anything, it merely marks the
>>>>>> node
>>>>>> removed. Removed nodes are unlinked as a side effect of scans,
>>>>>> during head and
>>>>>> next calls, but only if they are not guard nodes.
>>>>>> 
>>>>>> There are additional complications in the restart and reap methods,
>>>>>> but we can
>>>>>> ignore them for now - my test does not use them.
>>>>>> 
>>>>>> Once a guard node is lost, the synchronization breaks down
>>>>>> completely, because
>>>>>> e.g. insertion at tail is protected by synchronization on the
>>>>>> FastList instance,
>>>>>> but unlinking of a removed node in the middle is protected, to the
>>>>>> extent it is
>>>>>> protected at all, by synchronization on the FastList.Node instance
>>>>>> that is being
>>>>>> removed.
>>>>>> 
>>>>>> The commonest failure symptom is a scan reaching the null next
>>>>>> pointer at the
>>>>>> end of the FIFO during head(), without first finding the guard node
>>>>>> it just set
>>>>>> up. An alternative form of failure is loss of some entries - they
>>>>>> get added, but
>>>>>> the remove threads never find them. The second symptom is
>>>>>> predominant in the
>>>>>> JavaSpaces stress test that got me started on this. Messing up a
>>>>>> next pointer
>>>>>> could cause either.
>>>>>> 
>>>>>> Incidentally, I'm curious about why it has such a fragile system in
>>>>>> which the
>>>>>> state of a scan is partly tracked by thread, when it seems like an
>>>>>> obvious
>>>>>> candidate for the Iterator pattern. Callers do need to be able to
>>>>>> find out if a
>>>>>> remove call succeeded or not (the node may have been removed by
>>>>>> another thread),
>>>>>> but that could be done in an interface extending Iterator. The
>>>>>> WeakHashMap in a
>>>>>> node that keeps track of the threads for which it is a guard would
>>>>>> instead track
>>>>>> the Iterator. There would be no need for thread local storage, the
>>>>>> same data
>>>>>> could be kept in the Iterator.
>>>>>> 
>>>>>> Thanks for any time you can spend looking at this.
>>>>>> 
>>>>>> Patricia
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
> 

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: datastructure classes

Posted by Mike McGrady <mm...@topiatechnology.com>.

++ this is so important.  The days of read mostly are gone.  

Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 14, 2010, at 7:25 AM, Patricia Shanahan <pa...@acm.org> wrote:

> Carfield Yim wrote:
>> Hi, copyonwritearraylist is suitable for this case imho. But it don't sound
>> like an option to you. Would you share a bit more. I am interest to learn
>> about it.
> 
> My first feeling was that there is at least one workload that seems
> plausible to me for which it would have poor performance.
> 
> The workload is one in which most transactions are either a write or a
> read-and-take. For example, consider a space that is the task pool for a
> distributed workload. The writers are discovering tasks that need doing,
> and recording them in the space. Each reader looks for a task it can do,
> takes it from the space, and performs it. Now suppose the space is
> buffering between bursty task creation and a fixed service rate.
> 
> At times, the queue will get long, because of a burst of new tasks being
> added. Once that happens, copy-on-write becomes expensive.
> 
> Could people with more JavaSpaces experience comment on the plausibility
> of this pattern?
> 
> I believe JavaSpace needs to be treated as an infrastructure that needs
> robust performance across a wide range of workloads, not just
> optimization to make one workload run fast. However, I am tempted to
> implement a CopyOnWriteArrayList version, and say that any other
> implementation must beat it on at least one benchmark.
> 
> Patricia

Re: datastructure classes

Posted by Carfield Yim <ca...@carfield.com.hk>.

Understand, however scala also copy-on-write most of the time because
list is immutable -
http://www.artima.com/scalazine/articles/steps.html , I guess this may
not as expensive as it seem?

I don't really "selling" this api, however, I feel it would be
acceptable to give it a try, and see how it perform?

On Tue, Dec 14, 2010 at 11:25 PM, Patricia Shanahan <pa...@acm.org> wrote:
> Carfield Yim wrote:
>>
>> Hi, copyonwritearraylist is suitable for this case imho. But it don't
>> sound
>> like an option to you. Would you share a bit more. I am interest to learn
>> about it.
>
> My first feeling was that there is at least one workload that seems
> plausible to me for which it would have poor performance.
>
> The workload is one in which most transactions are either a write or a
> read-and-take. For example, consider a space that is the task pool for a
> distributed workload. The writers are discovering tasks that need doing,
> and recording them in the space. Each reader looks for a task it can do,
> takes it from the space, and performs it. Now suppose the space is
> buffering between bursty task creation and a fixed service rate.
>
> At times, the queue will get long, because of a burst of new tasks being
> added. Once that happens, copy-on-write becomes expensive.
>
> Could people with more JavaSpaces experience comment on the plausibility
> of this pattern?
>
> I believe JavaSpace needs to be treated as an infrastructure that needs
> robust performance across a wide range of workloads, not just
> optimization to make one workload run fast. However, I am tempted to
> implement a CopyOnWriteArrayList version, and say that any other
> implementation must beat it on at least one benchmark.
>
> Patricia
>

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

Carfield Yim wrote:
> Hi, copyonwritearraylist is suitable for this case imho. But it don't sound
> like an option to you. Would you share a bit more. I am interest to learn
> about it.

My first feeling was that there is at least one workload that seems
plausible to me for which it would have poor performance.

The workload is one in which most transactions are either a write or a
read-and-take. For example, consider a space that is the task pool for a
distributed workload. The writers are discovering tasks that need doing,
and recording them in the space. Each reader looks for a task it can do,
takes it from the space, and performs it. Now suppose the space is
buffering between bursty task creation and a fixed service rate.

At times, the queue will get long, because of a burst of new tasks being
added. Once that happens, copy-on-write becomes expensive.

Could people with more JavaSpaces experience comment on the plausibility
of this pattern?

I believe JavaSpace needs to be treated as an infrastructure that needs
robust performance across a wide range of workloads, not just
optimization to make one workload run fast. However, I am tempted to
implement a CopyOnWriteArrayList version, and say that any other
implementation must beat it on at least one benchmark.

Patricia

Re: datastructure classes

Posted by Mike McGrady <mm...@topiatechnology.com>.

Note that multiple JVMs impact real-time constraints in ways that demand farsighted planning.

Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 14, 2010, at 3:19 AM, Carfield Yim <ca...@carfield.com.hk> wrote:

> Hi, copyonwritearraylist is suitable for this case imho. But it don't sound
> like an option to you. Would you share a bit more. I am interest to learn
> about it.
> On Dec 14, 2010 9:15 AM, "Patricia Shanahan" <pa...@acm.org> wrote:
>> 
>> You make a good point that I should take a look around to see if I can
> find anything suitable that already exists, with appropriate licensing.
>> 
>> That said, I believe the requirements for an outrigger FastList are:
>> 
>> 1. Highly scalable performance. In particular, iterating over the list
> should be fast even when many threads are doing it. As far as I can tell,
> any JavaSpace lookup turns into iterating over a FastList representing items
> of acceptable classes, to see if any of the nodes match the other template
> requirements.
>> 
>> 2. A limited set of features: Iterate over the list starting at the head,
> remove a node, add at tail.
>> 
>> The two libraries each go in a different direction.
>> 
>> Javolution is primarily a single thread speed play, with fast and
> predictable performance. It is what I would want if I were writing real time
> code. It does have some support for concurrency, but not what is needed for
> outrigger: "Fast collections (or maps) can be made thread-safe by marking
> them FastCollection#shared shared Having a shared collection (or map) means
> that it can be safely used without synchronization. It does not mean that a
> change made by a thread is automatically viewed by another thread (that
> would require synchronizing)."
>> 
>> The Apache Commons collections are very feature rich, but seem to be
> primarily single thread, with the synchronizing decorator approach to
> concurrency.
>> 
>> I will do some more web searching, but my current strongest contender for
> an existing class is java.util.concurrent.ConcurrentLinkedQueue. It's main
> limitation is slow remove, but that might be worked around by using a pair
> of queues, and periodically cleaning by copying from one queue to another.
> Also, java.util.concurrent contains a reference to an important paper,
> http://www.cs.rochester.edu/u/michael/PODC96.html, that gives me a starting
> point for finding the latest research on these issues.
>> 
>> Patricia
>> 
>> 
>> 
>> 
>> 
>> On 12/13/2010 4:30 PM, James Grahn wrote:
>>> 
>>> Because there've been a few concerns about memory models and concurrent
>>> data structures, I thought I'd suggest a couple of resources. Library
>>> usage is preferable to reinventing the wheel, where possible:
>>> 
>>> First:
>>> http://javolution.org/
>>> 
>>> I haven't used it directly; I've only used a library which used it. But
>>> it perhaps deserves a scan to see if it would meet our requirements (and
>>> testing to ensure it's acceptable if it claims to).
>>> 
>>> Also, it's BSD-licensed. I believe that's acceptable?
>>> 
>>> Second:
>>> http://commons.apache.org/collections/index.html
>>> 
>>> I don't *think* it has what we need, but it's worth poking around in
>>> before creating a new datastructure. Has the advantage of being part of
>>> the Apache family.
>>> 
>>> jamesG
>>> 
>>> On 12/13/2010 5:24 PM, Peter Firmstone wrote:
>>>> 
>>>> Gregg Wonderly wrote:
>>>>> 
>>>>> On 12/13/2010 12:34 PM, Gregg Wonderly wrote:
>>>>>> 
>>>>>> This does fail fairly quickly (immediately) on my windows laptop.
>>>>>> 
>>>>>> I am not sure that I have time to really look over this code. I
>>>>>> wonder if anyone
>>>>>> knows if this is relatively new code that John put together as part
>>>>>> of the
>>>>>> effort to remove the use of PSE from outrigger, or is the original
>>>>>> "non-persistent" javaspaces implementation?
>>>>>> 
>>>>>> Perhaps we need to do something different here, a segmented list for
>>>>>> example,
>>>>>> which is what PSE did with it's Vector implementation so that
>>>>>> segments of the
>>>>>> list could be locked independently, as well as allowing the segments
>>>>>> to be
>>>>>> "deleted" from disk once they were "empty".
>>>>> 
>>>>> 
>>>>> And, of course if we pull out outrigger, as an application/service,
>>>>> separate from the Jini part of river, we could just say that Outrigger
>>>>> requires JDK1.5 so that we can move to a different concurrency
>>>>> implementation if that is needed, using the new memory model.
>>>> 
>>>> 
>>>> It seems like a lot of work to fix a package private implementation,
>>>> already based on flawed assumptions (Patricia's done a great job
>>>> debugging this, she's a real asset, I look forward to learning more
>>>> debugging tips). I suspect we'll get a much better result starting from
>>>> scratch, utilising Java 6.
>>>> 
>>>> Since River is a Jini platform, why don't we start by creating an
>>>> independent implementation of outrigger utilising any latest available
>>>> java features. Not only will this produce a better implementation, that
>>>> will be easier to support, but it might improve our understanding of
>>>> what's required for a modular build as well.
>>>> 
>>>> The existing outrigger implementation can remain as it is, but
>>>> deprecated, left in place for legacy support.
>>>> 
>>>> I've got some concurrent utilities in pepe you may pinch / improve if
>>>> you like:
>>>> 
>>>> org.apache.river.impl.util.*
>>>> 
>>>> ConcurrentCollections
>>>> ConcurrentSoftMap
>>>> ConcurrentWeakIdentityMap
>>>> ConcurrentWeakMap
>>>> 
>>>> ConcurrentCollections is a multi read single write lock based collection
>>>> wrapper. It defensively copies for Iterators but still allows performing
>>>> removals from the underlying collection.
>>>> 
>>>> The ConcurrentMap's are based on ConcurrentHashMap, using a
>>>> ReferenceQueue to remove stale entries prior to every method call.
>>>> Cheers,
>>>> 
>>>> Peter.
>>>> 
>>>>> 
>>>>> One of the things I did in griddle, was define a RemoteIterator which
>>>>> allows a "get" operation to have a return value before anything is in
>>>>> the space. A server side thread, then looks through the space for
>>>>> matches and adds them to the return queue for the calling "get". This
>>>>> allows server side contention to be managed to some degree because the
>>>>> number of "searching" threads could be held to an appropriate minimum
>>>>> (even one). The javaspaces API doesn't disallow such proxy
>>>>> implementation and JavaSpaces05 iterator support starts to expose this
>>>>> kind of thing more literally.
>>>>> 
>>>>> Ultimately, I think a segmented list that looks like a
>>>>> List<List<Entry>> would be a way to distribute locking for "get"
>>>>> because iteration would be less likely to be on the same segment at
>>>>> the same time.
>>>>> 
>>>>> Insertion at the tail, is always a contentious issue for concurrent
>>>>> lists. Sometimes you can just use a small ConcurrentHashMap to perform
>>>>> adds until it gets to a particular size, and then turn its contents
>>>>> into a List and add that list to the tail of the List<List<Entry>>.
>>>>> You can choose then to decide when to do that movement by watching for
>>>>> the other lists to be empty too, or when a traversal gets to the end
>>>>> of what is in a list already.
>>>>> 
>>>>> This keeps writers quite free to insert quickly and completely away
>>>>> from the readers as well. Ordering presents most of the opportunities
>>>>> for big time contention. Unordered (map), or more segmented (map of
>>>>> list) construction relieves some of the contention if course, by
>>>>> distributing it.
>>>>> 
>>>>> Gregg
>>>>> 
>>>>>> Gregg
>>>>>> 
>>>>>> On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
>>>>>>> 
>>>>>>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>>>>>>> 
>>>>>>>> Patricia Shanahan wrote:
>>>>>>>>> 
>>>>>>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>>>>>>> ...
>>>>>>>>>> The important issue in FastList is that it was written with the
>>>>>>>>> JDK1.4
>>>>>>>>>> memory model. After moving River to Java 1.5, we'd have the
>>>>>>>>> JSR166 work
>>>>>>>>>> and the new, consistent memory model where volatile has a true
>>>>>>>>> meaning.
>>>>>>>>>> However, this code in particular is quite complex as you have
>>>>>>>>> noted, so
>>>>>>>>>> even adjusting to the new memory model could be problematic.
>>>>>>>>> 
>>>>>>>>> I've just run a modified, simplified version of my test with java
>>>>>>>>> version "1.4.2_19" and an unmodified copy of FastList, and I still
>>>>>>>>> get
>>>>>>>>> the NullPointerException. This changes my thinking a bit. I had
> been
>>>>>>>>> working from the assumption that the issue was to do with the
>>>>>>>>> changes
>>>>>>>>> in memory model between 1.4 and 1.5. I now have to consider the
>>>>>>>>> possibility of a more basic bug that is independent of the memory
>>>>>>>>> model.
>>>>>>>>> 
>>>>>>>>> If there is anyone with a FastList or Java memory model background
>>>>>>>>> who
>>>>>>>>> would like to help, please reply. I would welcome another set of
>>>>>>>>> eyes
>>>>>>>>> on the code, and a cross check on my conclusions so far about how
>>>>>>>>> FastList is supposed to work. There seems to be a critical
> invariant
>>>>>>>>> that gets broken, and once that happens we are on track to either a
>>>>>>>>> NullPointerException or dropped items.
>>>>>>>>> 
>>>>>>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a
> main
>>>>>>>>> program (JDK 1.4 or later0. In both forms, all

Re: datastructure classes

Posted by Carfield Yim <ca...@carfield.com.hk>.

Hi, copyonwritearraylist is suitable for this case imho. But it don't sound
like an option to you. Would you share a bit more. I am interest to learn
about it.
On Dec 14, 2010 9:15 AM, "Patricia Shanahan" <pa...@acm.org> wrote:
>
> You make a good point that I should take a look around to see if I can
find anything suitable that already exists, with appropriate licensing.
>
> That said, I believe the requirements for an outrigger FastList are:
>
> 1. Highly scalable performance. In particular, iterating over the list
should be fast even when many threads are doing it. As far as I can tell,
any JavaSpace lookup turns into iterating over a FastList representing items
of acceptable classes, to see if any of the nodes match the other template
requirements.
>
> 2. A limited set of features: Iterate over the list starting at the head,
remove a node, add at tail.
>
> The two libraries each go in a different direction.
>
> Javolution is primarily a single thread speed play, with fast and
predictable performance. It is what I would want if I were writing real time
code. It does have some support for concurrency, but not what is needed for
outrigger: "Fast collections (or maps) can be made thread-safe by marking
them FastCollection#shared shared Having a shared collection (or map) means
that it can be safely used without synchronization. It does not mean that a
change made by a thread is automatically viewed by another thread (that
would require synchronizing)."
>
> The Apache Commons collections are very feature rich, but seem to be
primarily single thread, with the synchronizing decorator approach to
concurrency.
>
> I will do some more web searching, but my current strongest contender for
an existing class is java.util.concurrent.ConcurrentLinkedQueue. It's main
limitation is slow remove, but that might be worked around by using a pair
of queues, and periodically cleaning by copying from one queue to another.
Also, java.util.concurrent contains a reference to an important paper,
http://www.cs.rochester.edu/u/michael/PODC96.html, that gives me a starting
point for finding the latest research on these issues.
>
> Patricia
>
>
>
>
>
> On 12/13/2010 4:30 PM, James Grahn wrote:
>>
>> Because there've been a few concerns about memory models and concurrent
>> data structures, I thought I'd suggest a couple of resources. Library
>> usage is preferable to reinventing the wheel, where possible:
>>
>> First:
>> http://javolution.org/
>>
>> I haven't used it directly; I've only used a library which used it. But
>> it perhaps deserves a scan to see if it would meet our requirements (and
>> testing to ensure it's acceptable if it claims to).
>>
>> Also, it's BSD-licensed. I believe that's acceptable?
>>
>> Second:
>> http://commons.apache.org/collections/index.html
>>
>> I don't *think* it has what we need, but it's worth poking around in
>> before creating a new datastructure. Has the advantage of being part of
>> the Apache family.
>>
>> jamesG
>>
>> On 12/13/2010 5:24 PM, Peter Firmstone wrote:
>>>
>>> Gregg Wonderly wrote:
>>>>
>>>> On 12/13/2010 12:34 PM, Gregg Wonderly wrote:
>>>>>
>>>>> This does fail fairly quickly (immediately) on my windows laptop.
>>>>>
>>>>> I am not sure that I have time to really look over this code. I
>>>>> wonder if anyone
>>>>> knows if this is relatively new code that John put together as part
>>>>> of the
>>>>> effort to remove the use of PSE from outrigger, or is the original
>>>>> "non-persistent" javaspaces implementation?
>>>>>
>>>>> Perhaps we need to do something different here, a segmented list for
>>>>> example,
>>>>> which is what PSE did with it's Vector implementation so that
>>>>> segments of the
>>>>> list could be locked independently, as well as allowing the segments
>>>>> to be
>>>>> "deleted" from disk once they were "empty".
>>>>
>>>>
>>>> And, of course if we pull out outrigger, as an application/service,
>>>> separate from the Jini part of river, we could just say that Outrigger
>>>> requires JDK1.5 so that we can move to a different concurrency
>>>> implementation if that is needed, using the new memory model.
>>>
>>>
>>> It seems like a lot of work to fix a package private implementation,
>>> already based on flawed assumptions (Patricia's done a great job
>>> debugging this, she's a real asset, I look forward to learning more
>>> debugging tips). I suspect we'll get a much better result starting from
>>> scratch, utilising Java 6.
>>>
>>> Since River is a Jini platform, why don't we start by creating an
>>> independent implementation of outrigger utilising any latest available
>>> java features. Not only will this produce a better implementation, that
>>> will be easier to support, but it might improve our understanding of
>>> what's required for a modular build as well.
>>>
>>> The existing outrigger implementation can remain as it is, but
>>> deprecated, left in place for legacy support.
>>>
>>> I've got some concurrent utilities in pepe you may pinch / improve if
>>> you like:
>>>
>>> org.apache.river.impl.util.*
>>>
>>> ConcurrentCollections
>>> ConcurrentSoftMap
>>> ConcurrentWeakIdentityMap
>>> ConcurrentWeakMap
>>>
>>> ConcurrentCollections is a multi read single write lock based collection
>>> wrapper. It defensively copies for Iterators but still allows performing
>>> removals from the underlying collection.
>>>
>>> The ConcurrentMap's are based on ConcurrentHashMap, using a
>>> ReferenceQueue to remove stale entries prior to every method call.
>>> Cheers,
>>>
>>> Peter.
>>>
>>>>
>>>> One of the things I did in griddle, was define a RemoteIterator which
>>>> allows a "get" operation to have a return value before anything is in
>>>> the space. A server side thread, then looks through the space for
>>>> matches and adds them to the return queue for the calling "get". This
>>>> allows server side contention to be managed to some degree because the
>>>> number of "searching" threads could be held to an appropriate minimum
>>>> (even one). The javaspaces API doesn't disallow such proxy
>>>> implementation and JavaSpaces05 iterator support starts to expose this
>>>> kind of thing more literally.
>>>>
>>>> Ultimately, I think a segmented list that looks like a
>>>> List<List<Entry>> would be a way to distribute locking for "get"
>>>> because iteration would be less likely to be on the same segment at
>>>> the same time.
>>>>
>>>> Insertion at the tail, is always a contentious issue for concurrent
>>>> lists. Sometimes you can just use a small ConcurrentHashMap to perform
>>>> adds until it gets to a particular size, and then turn its contents
>>>> into a List and add that list to the tail of the List<List<Entry>>.
>>>> You can choose then to decide when to do that movement by watching for
>>>> the other lists to be empty too, or when a traversal gets to the end
>>>> of what is in a list already.
>>>>
>>>> This keeps writers quite free to insert quickly and completely away
>>>> from the readers as well. Ordering presents most of the opportunities
>>>> for big time contention. Unordered (map), or more segmented (map of
>>>> list) construction relieves some of the contention if course, by
>>>> distributing it.
>>>>
>>>> Gregg
>>>>
>>>>> Gregg
>>>>>
>>>>> On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
>>>>>>
>>>>>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>>>>>>
>>>>>>> Patricia Shanahan wrote:
>>>>>>>>
>>>>>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>>>>>> ...
>>>>>>>> > The important issue in FastList is that it was written with the
>>>>>>>> JDK1.4
>>>>>>>> > memory model. After moving River to Java 1.5, we'd have the
>>>>>>>> JSR166 work
>>>>>>>> > and the new, consistent memory model where volatile has a true
>>>>>>>> meaning.
>>>>>>>> > However, this code in particular is quite complex as you have
>>>>>>>> noted, so
>>>>>>>> > even adjusting to the new memory model could be problematic.
>>>>>>>>
>>>>>>>> I've just run a modified, simplified version of my test with java
>>>>>>>> version "1.4.2_19" and an unmodified copy of FastList, and I still
>>>>>>>> get
>>>>>>>> the NullPointerException. This changes my thinking a bit. I had
been
>>>>>>>> working from the assumption that the issue was to do with the
>>>>>>>> changes
>>>>>>>> in memory model between 1.4 and 1.5. I now have to consider the
>>>>>>>> possibility of a more basic bug that is independent of the memory
>>>>>>>> model.
>>>>>>>>
>>>>>>>> If there is anyone with a FastList or Java memory model background
>>>>>>>> who
>>>>>>>> would like to help, please reply. I would welcome another set of
>>>>>>>> eyes
>>>>>>>> on the code, and a cross check on my conclusions so far about how
>>>>>>>> FastList is supposed to work. There seems to be a critical
invariant
>>>>>>>> that gets broken, and once that happens we are on track to either a
>>>>>>>> NullPointerException or dropped items.
>>>>>>>>
>>>>>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a
main
>>>>>>>> program (JDK 1.4 or later0. In both forms, all

Re: datastructure classes

Posted by Patricia Shanahan <pa...@acm.org>.

You make a good point that I should take a look around to see if I can 
find anything suitable that already exists, with appropriate licensing.

That said, I believe the requirements for an outrigger FastList are:

1. Highly scalable performance. In particular, iterating over the list 
should be fast even when many threads are doing it. As far as I can 
tell, any JavaSpace lookup turns into iterating over a FastList 
representing items of acceptable classes, to see if any of the nodes 
match the other template requirements.

2. A limited set of features: Iterate over the list starting at the 
head, remove a node, add at tail.

The two libraries each go in a different direction.

Javolution is primarily a single thread speed play, with fast and 
predictable performance. It is what I would want if I were writing real 
time code. It does have some support for concurrency, but not what is 
needed for outrigger: "Fast collections (or maps) can be made 
thread-safe by marking them FastCollection#shared shared Having a shared 
collection (or map) means that it can be safely used without 
synchronization. It does not mean that a change made by a thread is 
automatically viewed by another thread (that would require synchronizing)."

The Apache Commons collections are very feature rich, but seem to be 
primarily single thread, with the synchronizing decorator approach to 
concurrency.

I will do some more web searching, but my current strongest contender 
for an existing class is java.util.concurrent.ConcurrentLinkedQueue. 
It's main limitation is slow remove, but that might be worked around by 
using a pair of queues, and periodically cleaning by copying from one 
queue to another. Also, java.util.concurrent contains a reference to an 
important paper, http://www.cs.rochester.edu/u/michael/PODC96.html, that 
gives me a starting point for finding the latest research on these issues.

Patricia




On 12/13/2010 4:30 PM, James Grahn wrote:
> Because there've been a few concerns about memory models and concurrent
> data structures, I thought I'd suggest a couple of resources. Library
> usage is preferable to reinventing the wheel, where possible:
>
> First:
> http://javolution.org/
>
> I haven't used it directly; I've only used a library which used it. But
> it perhaps deserves a scan to see if it would meet our requirements (and
> testing to ensure it's acceptable if it claims to).
>
> Also, it's BSD-licensed. I believe that's acceptable?
>
> Second:
> http://commons.apache.org/collections/index.html
>
> I don't *think* it has what we need, but it's worth poking around in
> before creating a new datastructure. Has the advantage of being part of
> the Apache family.
>
> jamesG
>
> On 12/13/2010 5:24 PM, Peter Firmstone wrote:
>> Gregg Wonderly wrote:
>>> On 12/13/2010 12:34 PM, Gregg Wonderly wrote:
>>>> This does fail fairly quickly (immediately) on my windows laptop.
>>>>
>>>> I am not sure that I have time to really look over this code. I
>>>> wonder if anyone
>>>> knows if this is relatively new code that John put together as part
>>>> of the
>>>> effort to remove the use of PSE from outrigger, or is the original
>>>> "non-persistent" javaspaces implementation?
>>>>
>>>> Perhaps we need to do something different here, a segmented list for
>>>> example,
>>>> which is what PSE did with it's Vector implementation so that
>>>> segments of the
>>>> list could be locked independently, as well as allowing the segments
>>>> to be
>>>> "deleted" from disk once they were "empty".
>>>
>>> And, of course if we pull out outrigger, as an application/service,
>>> separate from the Jini part of river, we could just say that Outrigger
>>> requires JDK1.5 so that we can move to a different concurrency
>>> implementation if that is needed, using the new memory model.
>>
>> It seems like a lot of work to fix a package private implementation,
>> already based on flawed assumptions (Patricia's done a great job
>> debugging this, she's a real asset, I look forward to learning more
>> debugging tips). I suspect we'll get a much better result starting from
>> scratch, utilising Java 6.
>>
>> Since River is a Jini platform, why don't we start by creating an
>> independent implementation of outrigger utilising any latest available
>> java features. Not only will this produce a better implementation, that
>> will be easier to support, but it might improve our understanding of
>> what's required for a modular build as well.
>>
>> The existing outrigger implementation can remain as it is, but
>> deprecated, left in place for legacy support.
>>
>> I've got some concurrent utilities in pepe you may pinch / improve if
>> you like:
>>
>> org.apache.river.impl.util.*
>>
>> ConcurrentCollections
>> ConcurrentSoftMap
>> ConcurrentWeakIdentityMap
>> ConcurrentWeakMap
>>
>> ConcurrentCollections is a multi read single write lock based collection
>> wrapper. It defensively copies for Iterators but still allows performing
>> removals from the underlying collection.
>>
>> The ConcurrentMap's are based on ConcurrentHashMap, using a
>> ReferenceQueue to remove stale entries prior to every method call.
>> Cheers,
>>
>> Peter.
>>
>>>
>>> One of the things I did in griddle, was define a RemoteIterator which
>>> allows a "get" operation to have a return value before anything is in
>>> the space. A server side thread, then looks through the space for
>>> matches and adds them to the return queue for the calling "get". This
>>> allows server side contention to be managed to some degree because the
>>> number of "searching" threads could be held to an appropriate minimum
>>> (even one). The javaspaces API doesn't disallow such proxy
>>> implementation and JavaSpaces05 iterator support starts to expose this
>>> kind of thing more literally.
>>>
>>> Ultimately, I think a segmented list that looks like a
>>> List<List<Entry>> would be a way to distribute locking for "get"
>>> because iteration would be less likely to be on the same segment at
>>> the same time.
>>>
>>> Insertion at the tail, is always a contentious issue for concurrent
>>> lists. Sometimes you can just use a small ConcurrentHashMap to perform
>>> adds until it gets to a particular size, and then turn its contents
>>> into a List and add that list to the tail of the List<List<Entry>>.
>>> You can choose then to decide when to do that movement by watching for
>>> the other lists to be empty too, or when a traversal gets to the end
>>> of what is in a list already.
>>>
>>> This keeps writers quite free to insert quickly and completely away
>>> from the readers as well. Ordering presents most of the opportunities
>>> for big time contention. Unordered (map), or more segmented (map of
>>> list) construction relieves some of the contention if course, by
>>> distributing it.
>>>
>>> Gregg
>>>
>>>> Gregg
>>>>
>>>> On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
>>>>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>>>>> Patricia Shanahan wrote:
>>>>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>>>>> ...
>>>>>>> > The important issue in FastList is that it was written with the
>>>>>>> JDK1.4
>>>>>>> > memory model. After moving River to Java 1.5, we'd have the
>>>>>>> JSR166 work
>>>>>>> > and the new, consistent memory model where volatile has a true
>>>>>>> meaning.
>>>>>>> > However, this code in particular is quite complex as you have
>>>>>>> noted, so
>>>>>>> > even adjusting to the new memory model could be problematic.
>>>>>>>
>>>>>>> I've just run a modified, simplified version of my test with java
>>>>>>> version "1.4.2_19" and an unmodified copy of FastList, and I still
>>>>>>> get
>>>>>>> the NullPointerException. This changes my thinking a bit. I had been
>>>>>>> working from the assumption that the issue was to do with the
>>>>>>> changes
>>>>>>> in memory model between 1.4 and 1.5. I now have to consider the
>>>>>>> possibility of a more basic bug that is independent of the memory
>>>>>>> model.
>>>>>>>
>>>>>>> If there is anyone with a FastList or Java memory model background
>>>>>>> who
>>>>>>> would like to help, please reply. I would welcome another set of
>>>>>>> eyes
>>>>>>> on the code, and a cross check on my conclusions so far about how
>>>>>>> FastList is supposed to work. There seems to be a critical invariant
>>>>>>> that gets broken, and once that happens we are on track to either a
>>>>>>> NullPointerException or dropped items.
>>>>>>>
>>>>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main
>>>>>>> program (JDK 1.4 or later0. In both forms, all it does is fire up a
>>>>>>> mixture of threads that repeatedly add items to a FastList and
>>>>>>> threads
>>>>>>> that repeatedly remove the first item they can from the FastList.
>>>>>>> Failures seem to require simultaneous adds and removes.
>>>>>>>
>>>>>>> If I don't nail this problem fairly soon, I may abandon the current,
>>>>>>> rather complicated, code and switch to writing a concurrent high
>>>>>>> performance FastList substitute for 1.5 or later.
>>>>>>>
>>>>>>> Patricia
>>>>>>>
>>>>>> I'll have a look tonight, no promises though ;)
>>>>>
>>>>> I'm attaching the simplified test application main program that can
>>>>> run, and
>>>>> fail, under JRE 1.4, with no need for Junit.
>>>>>
>>>>> Here's what I think I know. First of all, I have found some dubious
>>>>> synchronization situations. However, fixing all the things I have so
>>>>> far found
>>>>> of that type has only reduced the failure rate, not eliminated
>>>>> failures. That
>>>>> could be caused by changing timings without having any impact on the
>>>>> root cause.
>>>>>
>>>>> The key invariant relates to a thread that is doing a scan, starting
>>>>> with a call
>>>>> to head() and proceeding through a series of calls to next() to
>>>>> examine nodes.
>>>>> The head() call sets up a guard node for the thread that was the
>>>>> tail at some
>>>>> point during the head call. The invariant is that the series of
>>>>> next() calls
>>>>> will reach the guard node before finding a null next pointer,
>>>>> indicating the
>>>>> actual tail.
>>>>>
>>>>> The remove call does not really remove anything, it merely marks the
>>>>> node
>>>>> removed. Removed nodes are unlinked as a side effect of scans,
>>>>> during head and
>>>>> next calls, but only if they are not guard nodes.
>>>>>
>>>>> There are additional complications in the restart and reap methods,
>>>>> but we can
>>>>> ignore them for now - my test does not use them.
>>>>>
>>>>> Once a guard node is lost, the synchronization breaks down
>>>>> completely, because
>>>>> e.g. insertion at tail is protected by synchronization on the
>>>>> FastList instance,
>>>>> but unlinking of a removed node in the middle is protected, to the
>>>>> extent it is
>>>>> protected at all, by synchronization on the FastList.Node instance
>>>>> that is being
>>>>> removed.
>>>>>
>>>>> The commonest failure symptom is a scan reaching the null next
>>>>> pointer at the
>>>>> end of the FIFO during head(), without first finding the guard node
>>>>> it just set
>>>>> up. An alternative form of failure is loss of some entries - they
>>>>> get added, but
>>>>> the remove threads never find them. The second symptom is
>>>>> predominant in the
>>>>> JavaSpaces stress test that got me started on this. Messing up a
>>>>> next pointer
>>>>> could cause either.
>>>>>
>>>>> Incidentally, I'm curious about why it has such a fragile system in
>>>>> which the
>>>>> state of a scan is partly tracked by thread, when it seems like an
>>>>> obvious
>>>>> candidate for the Iterator pattern. Callers do need to be able to
>>>>> find out if a
>>>>> remove call succeeded or not (the node may have been removed by
>>>>> another thread),
>>>>> but that could be done in an interface extending Iterator. The
>>>>> WeakHashMap in a
>>>>> node that keeps track of the threads for which it is a guard would
>>>>> instead track
>>>>> the Iterator. There would be no need for thread local storage, the
>>>>> same data
>>>>> could be kept in the Iterator.
>>>>>
>>>>> Thanks for any time you can spend looking at this.
>>>>>
>>>>> Patricia
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>

datastructure classes

Posted by James Grahn <jg...@simulexinc.com>.

Because there've been a few concerns about memory models and concurrent 
data structures, I thought I'd suggest a couple of resources.   Library 
usage is preferable to reinventing the wheel, where possible:

First:
http://javolution.org/

I haven't used it directly; I've only used a library which used it. 
But it perhaps deserves a scan to see if it would meet our requirements 
(and testing to ensure it's acceptable if it claims to).

Also, it's BSD-licensed.   I believe that's acceptable?

Second:
http://commons.apache.org/collections/index.html

I don't *think* it has what we need, but it's worth poking around in 
before creating a new datastructure.   Has the advantage of being part 
of the Apache family.

jamesG

On 12/13/2010 5:24 PM, Peter Firmstone wrote:
> Gregg Wonderly wrote:
>> On 12/13/2010 12:34 PM, Gregg Wonderly wrote:
>>> This does fail fairly quickly (immediately) on my windows laptop.
>>>
>>> I am not sure that I have time to really look over this code. I
>>> wonder if anyone
>>> knows if this is relatively new code that John put together as part
>>> of the
>>> effort to remove the use of PSE from outrigger, or is the original
>>> "non-persistent" javaspaces implementation?
>>>
>>> Perhaps we need to do something different here, a segmented list for
>>> example,
>>> which is what PSE did with it's Vector implementation so that
>>> segments of the
>>> list could be locked independently, as well as allowing the segments
>>> to be
>>> "deleted" from disk once they were "empty".
>>
>> And, of course if we pull out outrigger, as an application/service,
>> separate from the Jini part of river, we could just say that Outrigger
>> requires JDK1.5 so that we can move to a different concurrency
>> implementation if that is needed, using the new memory model.
>
> It seems like a lot of work to fix a package private implementation,
> already based on flawed assumptions (Patricia's done a great job
> debugging this, she's a real asset, I look forward to learning more
> debugging tips). I suspect we'll get a much better result starting from
> scratch, utilising Java 6.
>
> Since River is a Jini platform, why don't we start by creating an
> independent implementation of outrigger utilising any latest available
> java features. Not only will this produce a better implementation, that
> will be easier to support, but it might improve our understanding of
> what's required for a modular build as well.
>
> The existing outrigger implementation can remain as it is, but
> deprecated, left in place for legacy support.
>
> I've got some concurrent utilities in pepe you may pinch / improve if
> you like:
>
> org.apache.river.impl.util.*
>
> ConcurrentCollections
> ConcurrentSoftMap
> ConcurrentWeakIdentityMap
> ConcurrentWeakMap
>
> ConcurrentCollections is a multi read single write lock based collection
> wrapper. It defensively copies for Iterators but still allows performing
> removals from the underlying collection.
>
> The ConcurrentMap's are based on ConcurrentHashMap, using a
> ReferenceQueue to remove stale entries prior to every method call.
> Cheers,
>
> Peter.
>
>>
>> One of the things I did in griddle, was define a RemoteIterator which
>> allows a "get" operation to have a return value before anything is in
>> the space. A server side thread, then looks through the space for
>> matches and adds them to the return queue for the calling "get". This
>> allows server side contention to be managed to some degree because the
>> number of "searching" threads could be held to an appropriate minimum
>> (even one). The javaspaces API doesn't disallow such proxy
>> implementation and JavaSpaces05 iterator support starts to expose this
>> kind of thing more literally.
>>
>> Ultimately, I think a segmented list that looks like a
>> List<List<Entry>> would be a way to distribute locking for "get"
>> because iteration would be less likely to be on the same segment at
>> the same time.
>>
>> Insertion at the tail, is always a contentious issue for concurrent
>> lists. Sometimes you can just use a small ConcurrentHashMap to perform
>> adds until it gets to a particular size, and then turn its contents
>> into a List and add that list to the tail of the List<List<Entry>>.
>> You can choose then to decide when to do that movement by watching for
>> the other lists to be empty too, or when a traversal gets to the end
>> of what is in a list already.
>>
>> This keeps writers quite free to insert quickly and completely away
>> from the readers as well. Ordering presents most of the opportunities
>> for big time contention. Unordered (map), or more segmented (map of
>> list) construction relieves some of the contention if course, by
>> distributing it.
>>
>> Gregg
>>
>>> Gregg
>>>
>>> On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
>>>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>>>> Patricia Shanahan wrote:
>>>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>>>> ...
>>>>>> > The important issue in FastList is that it was written with the
>>>>>> JDK1.4
>>>>>> > memory model. After moving River to Java 1.5, we'd have the
>>>>>> JSR166 work
>>>>>> > and the new, consistent memory model where volatile has a true
>>>>>> meaning.
>>>>>> > However, this code in particular is quite complex as you have
>>>>>> noted, so
>>>>>> > even adjusting to the new memory model could be problematic.
>>>>>>
>>>>>> I've just run a modified, simplified version of my test with java
>>>>>> version "1.4.2_19" and an unmodified copy of FastList, and I still
>>>>>> get
>>>>>> the NullPointerException. This changes my thinking a bit. I had been
>>>>>> working from the assumption that the issue was to do with the changes
>>>>>> in memory model between 1.4 and 1.5. I now have to consider the
>>>>>> possibility of a more basic bug that is independent of the memory
>>>>>> model.
>>>>>>
>>>>>> If there is anyone with a FastList or Java memory model background
>>>>>> who
>>>>>> would like to help, please reply. I would welcome another set of eyes
>>>>>> on the code, and a cross check on my conclusions so far about how
>>>>>> FastList is supposed to work. There seems to be a critical invariant
>>>>>> that gets broken, and once that happens we are on track to either a
>>>>>> NullPointerException or dropped items.
>>>>>>
>>>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main
>>>>>> program (JDK 1.4 or later0. In both forms, all it does is fire up a
>>>>>> mixture of threads that repeatedly add items to a FastList and
>>>>>> threads
>>>>>> that repeatedly remove the first item they can from the FastList.
>>>>>> Failures seem to require simultaneous adds and removes.
>>>>>>
>>>>>> If I don't nail this problem fairly soon, I may abandon the current,
>>>>>> rather complicated, code and switch to writing a concurrent high
>>>>>> performance FastList substitute for 1.5 or later.
>>>>>>
>>>>>> Patricia
>>>>>>
>>>>> I'll have a look tonight, no promises though ;)
>>>>
>>>> I'm attaching the simplified test application main program that can
>>>> run, and
>>>> fail, under JRE 1.4, with no need for Junit.
>>>>
>>>> Here's what I think I know. First of all, I have found some dubious
>>>> synchronization situations. However, fixing all the things I have so
>>>> far found
>>>> of that type has only reduced the failure rate, not eliminated
>>>> failures. That
>>>> could be caused by changing timings without having any impact on the
>>>> root cause.
>>>>
>>>> The key invariant relates to a thread that is doing a scan, starting
>>>> with a call
>>>> to head() and proceeding through a series of calls to next() to
>>>> examine nodes.
>>>> The head() call sets up a guard node for the thread that was the
>>>> tail at some
>>>> point during the head call. The invariant is that the series of
>>>> next() calls
>>>> will reach the guard node before finding a null next pointer,
>>>> indicating the
>>>> actual tail.
>>>>
>>>> The remove call does not really remove anything, it merely marks the
>>>> node
>>>> removed. Removed nodes are unlinked as a side effect of scans,
>>>> during head and
>>>> next calls, but only if they are not guard nodes.
>>>>
>>>> There are additional complications in the restart and reap methods,
>>>> but we can
>>>> ignore them for now - my test does not use them.
>>>>
>>>> Once a guard node is lost, the synchronization breaks down
>>>> completely, because
>>>> e.g. insertion at tail is protected by synchronization on the
>>>> FastList instance,
>>>> but unlinking of a removed node in the middle is protected, to the
>>>> extent it is
>>>> protected at all, by synchronization on the FastList.Node instance
>>>> that is being
>>>> removed.
>>>>
>>>> The commonest failure symptom is a scan reaching the null next
>>>> pointer at the
>>>> end of the FIFO during head(), without first finding the guard node
>>>> it just set
>>>> up. An alternative form of failure is loss of some entries - they
>>>> get added, but
>>>> the remove threads never find them. The second symptom is
>>>> predominant in the
>>>> JavaSpaces stress test that got me started on this. Messing up a
>>>> next pointer
>>>> could cause either.
>>>>
>>>> Incidentally, I'm curious about why it has such a fragile system in
>>>> which the
>>>> state of a scan is partly tracked by thread, when it seems like an
>>>> obvious
>>>> candidate for the Iterator pattern. Callers do need to be able to
>>>> find out if a
>>>> remove call succeeded or not (the node may have been removed by
>>>> another thread),
>>>> but that could be done in an interface extending Iterator. The
>>>> WeakHashMap in a
>>>> node that keeps track of the threads for which it is a guard would
>>>> instead track
>>>> the Iterator. There would be no need for thread local storage, the
>>>> same data
>>>> could be kept in the Iterator.
>>>>
>>>> Thanks for any time you can spend looking at this.
>>>>
>>>> Patricia
>>>>
>>>>
>>>
>>>
>>
>>
>
>

Re: Progress, and a problem

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/13/2010 2:24 PM, Peter Firmstone wrote:
> Gregg Wonderly wrote:
...
>> And, of course if we pull out outrigger, as an application/service,
>> separate from the Jini part of river, we could just say that Outrigger
>> requires JDK1.5 so that we can move to a different concurrency
>> implementation if that is needed, using the new memory model.
>
> It seems like a lot of work to fix a package private implementation,
> already based on flawed assumptions (Patricia's done a great job
> debugging this, she's a real asset, I look forward to learning more
> debugging tips). I suspect we'll get a much better result starting from
> scratch, utilising Java 6.
>
> Since River is a Jini platform, why don't we start by creating an
> independent implementation of outrigger utilising any latest available
> java features. Not only will this produce a better implementation, that
> will be easier to support, but it might improve our understanding of
> what's required for a modular build as well.

I think the big payoff is going to 1.5. I have not yet found the exact 
bug in FastList but I think the root cause is probably over-complicated 
concurrency leading to lost nodes.

Although FastList should be the bottleneck in outrigger, because it is 
where the real work is done, I'm not sure it is. I got faster failures, 
probably due to a higher transaction rate, when I dropped the QA test 
that went through the JavaSpaces service in favor of a unit test that 
directly accesses FastList.

My next plan is to write a FastList replacement that is at least as good 
as the current one from the point of view of concurrency, but simpler 
and more maintainable. It will depend on the 1.5 memory model and the 
corresponding version of java.util and its sub-packages. I will also 
write additional unit tests in the spirit of the add/remove test that 
demonstrates the current problem. More maintainable interfaces plus a 
better unit test will be a good basis if we need to reduce contention 
between threads by some of the more extreme ideas.

I regard this as a bug fix exercise, but the 1.5 requirement may affect 
how and when it should be released. Should I create a new package for 
the 1.5 dependent version of outrigger? Any thoughts?

Patricia

Re: Progress, and a problem

Posted by Peter Firmstone <ji...@zeus.net.au>.

Gregg Wonderly wrote:
> On 12/13/2010 12:34 PM, Gregg Wonderly wrote:
>> This does fail fairly quickly (immediately) on my windows laptop.
>>
>> I am not sure that I have time to really look over this code. I 
>> wonder if anyone
>> knows if this is relatively new code that John put together as part 
>> of the
>> effort to remove the use of PSE from outrigger, or is the original
>> "non-persistent" javaspaces implementation?
>>
>> Perhaps we need to do something different here, a segmented list for 
>> example,
>> which is what PSE did with it's Vector implementation so that 
>> segments of the
>> list could be locked independently, as well as allowing the segments 
>> to be
>> "deleted" from disk once they were "empty".
>
> And, of course if we pull out outrigger, as an application/service, 
> separate from the Jini part of river, we could just say that Outrigger 
> requires JDK1.5 so that we can move to a different concurrency 
> implementation if that is needed, using the new memory model.

It seems like a lot of work to fix a package private implementation, 
already based on flawed assumptions (Patricia's done a great job 
debugging this, she's a real asset, I look forward to learning more 
debugging tips).  I suspect we'll get a much better result starting from 
scratch, utilising Java 6.

Since River is a Jini platform, why don't we start by creating an 
independent implementation of outrigger utilising any latest available 
java features.  Not only will this produce a better implementation, that 
will be easier to support, but it might improve our understanding of 
what's required for a modular build as well.

The existing outrigger implementation can remain as it is, but 
deprecated, left in place for legacy support.

I've got some concurrent utilities in pepe you may pinch / improve if 
you like:

org.apache.river.impl.util.*

ConcurrentCollections
ConcurrentSoftMap
ConcurrentWeakIdentityMap
ConcurrentWeakMap

ConcurrentCollections is a multi read single write lock based collection 
wrapper.  It defensively copies for Iterators but still allows 
performing removals from the underlying collection.

The ConcurrentMap's are based on ConcurrentHashMap, using a 
ReferenceQueue to remove stale entries prior to every method call.
Cheers,

Peter.

>
> One of the things I did in griddle, was define a RemoteIterator which 
> allows a "get" operation to have a return value before anything is in 
> the space.  A server side thread, then looks through the space for 
> matches and adds them to the return queue for the calling "get".  This 
> allows server side contention to be managed to some degree because the 
> number of "searching" threads could be held to an appropriate minimum 
> (even one).  The javaspaces API doesn't disallow such proxy 
> implementation and JavaSpaces05 iterator support starts to expose this 
> kind of thing more literally.
>
> Ultimately, I think a segmented list that looks like a 
> List<List<Entry>> would be a way to distribute locking for "get" 
> because iteration would be less likely to be on the same segment at 
> the same time.
>
> Insertion at the tail, is always a contentious issue for concurrent 
> lists. Sometimes you can just use a small ConcurrentHashMap to perform 
> adds until it gets to a particular size, and then turn its contents 
> into a List and add that list to the tail of the List<List<Entry>>.  
> You can choose then to decide when to do that movement by watching for 
> the other lists to be empty too, or when a traversal gets to the end 
> of what is in a list already.
>
> This keeps writers quite free to insert quickly and completely away 
> from the readers as well.  Ordering presents most of the opportunities 
> for big time contention.  Unordered (map), or more segmented (map of 
> list) construction relieves some of the contention if course, by 
> distributing it.
>
> Gregg
>
>> Gregg
>>
>> On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
>>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>>> Patricia Shanahan wrote:
>>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>>> ...
>>>>> > The important issue in FastList is that it was written with the 
>>>>> JDK1.4
>>>>> > memory model. After moving River to Java 1.5, we'd have the 
>>>>> JSR166 work
>>>>> > and the new, consistent memory model where volatile has a true 
>>>>> meaning.
>>>>> > However, this code in particular is quite complex as you have 
>>>>> noted, so
>>>>> > even adjusting to the new memory model could be problematic.
>>>>>
>>>>> I've just run a modified, simplified version of my test with java
>>>>> version "1.4.2_19" and an unmodified copy of FastList, and I still 
>>>>> get
>>>>> the NullPointerException. This changes my thinking a bit. I had been
>>>>> working from the assumption that the issue was to do with the changes
>>>>> in memory model between 1.4 and 1.5. I now have to consider the
>>>>> possibility of a more basic bug that is independent of the memory 
>>>>> model.
>>>>>
>>>>> If there is anyone with a FastList or Java memory model background 
>>>>> who
>>>>> would like to help, please reply. I would welcome another set of eyes
>>>>> on the code, and a cross check on my conclusions so far about how
>>>>> FastList is supposed to work. There seems to be a critical invariant
>>>>> that gets broken, and once that happens we are on track to either a
>>>>> NullPointerException or dropped items.
>>>>>
>>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main
>>>>> program (JDK 1.4 or later0. In both forms, all it does is fire up a
>>>>> mixture of threads that repeatedly add items to a FastList and 
>>>>> threads
>>>>> that repeatedly remove the first item they can from the FastList.
>>>>> Failures seem to require simultaneous adds and removes.
>>>>>
>>>>> If I don't nail this problem fairly soon, I may abandon the current,
>>>>> rather complicated, code and switch to writing a concurrent high
>>>>> performance FastList substitute for 1.5 or later.
>>>>>
>>>>> Patricia
>>>>>
>>>> I'll have a look tonight, no promises though ;)
>>>
>>> I'm attaching the simplified test application main program that can 
>>> run, and
>>> fail, under JRE 1.4, with no need for Junit.
>>>
>>> Here's what I think I know. First of all, I have found some dubious
>>> synchronization situations. However, fixing all the things I have so 
>>> far found
>>> of that type has only reduced the failure rate, not eliminated 
>>> failures. That
>>> could be caused by changing timings without having any impact on the 
>>> root cause.
>>>
>>> The key invariant relates to a thread that is doing a scan, starting 
>>> with a call
>>> to head() and proceeding through a series of calls to next() to 
>>> examine nodes.
>>> The head() call sets up a guard node for the thread that was the 
>>> tail at some
>>> point during the head call. The invariant is that the series of 
>>> next() calls
>>> will reach the guard node before finding a null next pointer, 
>>> indicating the
>>> actual tail.
>>>
>>> The remove call does not really remove anything, it merely marks the 
>>> node
>>> removed. Removed nodes are unlinked as a side effect of scans, 
>>> during head and
>>> next calls, but only if they are not guard nodes.
>>>
>>> There are additional complications in the restart and reap methods, 
>>> but we can
>>> ignore them for now - my test does not use them.
>>>
>>> Once a guard node is lost, the synchronization breaks down 
>>> completely, because
>>> e.g. insertion at tail is protected by synchronization on the 
>>> FastList instance,
>>> but unlinking of a removed node in the middle is protected, to the 
>>> extent it is
>>> protected at all, by synchronization on the FastList.Node instance 
>>> that is being
>>> removed.
>>>
>>> The commonest failure symptom is a scan reaching the null next 
>>> pointer at the
>>> end of the FIFO during head(), without first finding the guard node 
>>> it just set
>>> up. An alternative form of failure is loss of some entries - they 
>>> get added, but
>>> the remove threads never find them. The second symptom is 
>>> predominant in the
>>> JavaSpaces stress test that got me started on this. Messing up a 
>>> next pointer
>>> could cause either.
>>>
>>> Incidentally, I'm curious about why it has such a fragile system in 
>>> which the
>>> state of a scan is partly tracked by thread, when it seems like an 
>>> obvious
>>> candidate for the Iterator pattern. Callers do need to be able to 
>>> find out if a
>>> remove call succeeded or not (the node may have been removed by 
>>> another thread),
>>> but that could be done in an interface extending Iterator. The 
>>> WeakHashMap in a
>>> node that keeps track of the threads for which it is a guard would 
>>> instead track
>>> the Iterator. There would be no need for thread local storage, the 
>>> same data
>>> could be kept in the Iterator.
>>>
>>> Thanks for any time you can spend looking at this.
>>>
>>> Patricia
>>>
>>>
>>
>>
>
>

Re: Progress, and a problem

Posted by Gregg Wonderly <gr...@wonderly.org>.

On 12/13/2010 12:34 PM, Gregg Wonderly wrote:
> This does fail fairly quickly (immediately) on my windows laptop.
>
> I am not sure that I have time to really look over this code. I wonder if anyone
> knows if this is relatively new code that John put together as part of the
> effort to remove the use of PSE from outrigger, or is the original
> "non-persistent" javaspaces implementation?
>
> Perhaps we need to do something different here, a segmented list for example,
> which is what PSE did with it's Vector implementation so that segments of the
> list could be locked independently, as well as allowing the segments to be
> "deleted" from disk once they were "empty".

And, of course if we pull out outrigger, as an application/service, separate 
from the Jini part of river, we could just say that Outrigger requires JDK1.5 so 
that we can move to a different concurrency implementation if that is needed, 
using the new memory model.

One of the things I did in griddle, was define a RemoteIterator which allows a 
"get" operation to have a return value before anything is in the space.  A 
server side thread, then looks through the space for matches and adds them to 
the return queue for the calling "get".  This allows server side contention to 
be managed to some degree because the number of "searching" threads could be 
held to an appropriate minimum (even one).  The javaspaces API doesn't disallow 
such proxy implementation and JavaSpaces05 iterator support starts to expose 
this kind of thing more literally.

Ultimately, I think a segmented list that looks like a List<List<Entry>> would 
be a way to distribute locking for "get" because iteration would be less likely 
to be on the same segment at the same time.

Insertion at the tail, is always a contentious issue for concurrent lists. 
Sometimes you can just use a small ConcurrentHashMap to perform adds until it 
gets to a particular size, and then turn its contents into a List and add that 
list to the tail of the List<List<Entry>>.  You can choose then to decide when 
to do that movement by watching for the other lists to be empty too, or when a 
traversal gets to the end of what is in a list already.

This keeps writers quite free to insert quickly and completely away from the 
readers as well.  Ordering presents most of the opportunities for big time 
contention.  Unordered (map), or more segmented (map of list) construction 
relieves some of the contention if course, by distributing it.

Gregg

> Gregg
>
> On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
>> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>>> Patricia Shanahan wrote:
>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>>> ...
>>>> > The important issue in FastList is that it was written with the JDK1.4
>>>> > memory model. After moving River to Java 1.5, we'd have the JSR166 work
>>>> > and the new, consistent memory model where volatile has a true meaning.
>>>> > However, this code in particular is quite complex as you have noted, so
>>>> > even adjusting to the new memory model could be problematic.
>>>>
>>>> I've just run a modified, simplified version of my test with java
>>>> version "1.4.2_19" and an unmodified copy of FastList, and I still get
>>>> the NullPointerException. This changes my thinking a bit. I had been
>>>> working from the assumption that the issue was to do with the changes
>>>> in memory model between 1.4 and 1.5. I now have to consider the
>>>> possibility of a more basic bug that is independent of the memory model.
>>>>
>>>> If there is anyone with a FastList or Java memory model background who
>>>> would like to help, please reply. I would welcome another set of eyes
>>>> on the code, and a cross check on my conclusions so far about how
>>>> FastList is supposed to work. There seems to be a critical invariant
>>>> that gets broken, and once that happens we are on track to either a
>>>> NullPointerException or dropped items.
>>>>
>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main
>>>> program (JDK 1.4 or later0. In both forms, all it does is fire up a
>>>> mixture of threads that repeatedly add items to a FastList and threads
>>>> that repeatedly remove the first item they can from the FastList.
>>>> Failures seem to require simultaneous adds and removes.
>>>>
>>>> If I don't nail this problem fairly soon, I may abandon the current,
>>>> rather complicated, code and switch to writing a concurrent high
>>>> performance FastList substitute for 1.5 or later.
>>>>
>>>> Patricia
>>>>
>>> I'll have a look tonight, no promises though ;)
>>
>> I'm attaching the simplified test application main program that can run, and
>> fail, under JRE 1.4, with no need for Junit.
>>
>> Here's what I think I know. First of all, I have found some dubious
>> synchronization situations. However, fixing all the things I have so far found
>> of that type has only reduced the failure rate, not eliminated failures. That
>> could be caused by changing timings without having any impact on the root cause.
>>
>> The key invariant relates to a thread that is doing a scan, starting with a call
>> to head() and proceeding through a series of calls to next() to examine nodes.
>> The head() call sets up a guard node for the thread that was the tail at some
>> point during the head call. The invariant is that the series of next() calls
>> will reach the guard node before finding a null next pointer, indicating the
>> actual tail.
>>
>> The remove call does not really remove anything, it merely marks the node
>> removed. Removed nodes are unlinked as a side effect of scans, during head and
>> next calls, but only if they are not guard nodes.
>>
>> There are additional complications in the restart and reap methods, but we can
>> ignore them for now - my test does not use them.
>>
>> Once a guard node is lost, the synchronization breaks down completely, because
>> e.g. insertion at tail is protected by synchronization on the FastList instance,
>> but unlinking of a removed node in the middle is protected, to the extent it is
>> protected at all, by synchronization on the FastList.Node instance that is being
>> removed.
>>
>> The commonest failure symptom is a scan reaching the null next pointer at the
>> end of the FIFO during head(), without first finding the guard node it just set
>> up. An alternative form of failure is loss of some entries - they get added, but
>> the remove threads never find them. The second symptom is predominant in the
>> JavaSpaces stress test that got me started on this. Messing up a next pointer
>> could cause either.
>>
>> Incidentally, I'm curious about why it has such a fragile system in which the
>> state of a scan is partly tracked by thread, when it seems like an obvious
>> candidate for the Iterator pattern. Callers do need to be able to find out if a
>> remove call succeeded or not (the node may have been removed by another thread),
>> but that could be done in an interface extending Iterator. The WeakHashMap in a
>> node that keeps track of the threads for which it is a guard would instead track
>> the Iterator. There would be no need for thread local storage, the same data
>> could be kept in the Iterator.
>>
>> Thanks for any time you can spend looking at this.
>>
>> Patricia
>>
>>
>
>

Re: Progress, and a problem

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/13/2010 10:34 AM, Gregg Wonderly wrote:
> This does fail fairly quickly (immediately) on my windows laptop.
>
> I am not sure that I have time to really look over this code. I wonder
> if anyone knows if this is relatively new code that John put together as
> part of the effort to remove the use of PSE from outrigger, or is the
> original "non-persistent" javaspaces implementation?
>
> Perhaps we need to do something different here, a segmented list for
> example, which is what PSE did with it's Vector implementation so that
> segments of the list could be locked independently, as well as allowing
> the segments to be "deleted" from disk once they were "empty".
...

I have an alternative in-memory FIFO FastList implementation, 
IterableFastList, which replaces the head and next calls in FastList 
with an Iterator.

It has spent the last few hours running the quickly failing test (with 
minimal changes to use the new implementation) through 120 passes with 
no crashes or dropped nodes. I know exactly why it works, and I plan to 
write comments explaining it.

I now need some benchmarks, to both measure its performance and, to the 
extent possible given FastList's tendency to crash under stress, compare 
it to the current FastList. I'll go ahead and write my own, but anyone 
who wants to influence performance tuning in this area might want to add 
their benchmarks. If anyone is interested, I'll send them a copy of the 
new class.

I expect the main scaling limitation in the new class to be continued 
use of synchronization when adding a node, so that the tail pointer and 
the next pointer in the old tail node are always updated simultaneously. 
If that becomes a significant bottleneck in a JavaSpaces benchmark I'll 
work on more aggressive implementations.

Question: Where should I put benchmarks that target a specific 
implementation class? In the test/src hierarchy, alongside the JUnit 
tests? Does the build process assume that only contains unit tests?

Patricia

Re: Progress, and a problem

Posted by Gregg Wonderly <gr...@wonderly.org>.

This does fail fairly quickly (immediately) on my windows laptop.

I am not sure that I have time to really look over this code.  I wonder if 
anyone knows if this is relatively new code that John put together as part of 
the effort to remove the use of PSE from outrigger, or is the original 
"non-persistent" javaspaces implementation?

Perhaps we need to do something different here, a segmented list for example, 
which is what PSE did with it's Vector implementation so that segments of the 
list could be locked independently, as well as allowing the segments to be 
"deleted" from disk once they were "empty".

Gregg

On 12/13/2010 12:16 AM, Patricia Shanahan wrote:
> On 12/12/2010 5:48 PM, Peter Firmstone wrote:
>> Patricia Shanahan wrote:
>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>>> ...
>>> > The important issue in FastList is that it was written with the JDK1.4
>>> > memory model. After moving River to Java 1.5, we'd have the JSR166 work
>>> > and the new, consistent memory model where volatile has a true meaning.
>>> > However, this code in particular is quite complex as you have noted, so
>>> > even adjusting to the new memory model could be problematic.
>>>
>>> I've just run a modified, simplified version of my test with java
>>> version "1.4.2_19" and an unmodified copy of FastList, and I still get
>>> the NullPointerException. This changes my thinking a bit. I had been
>>> working from the assumption that the issue was to do with the changes
>>> in memory model between 1.4 and 1.5. I now have to consider the
>>> possibility of a more basic bug that is independent of the memory model.
>>>
>>> If there is anyone with a FastList or Java memory model background who
>>> would like to help, please reply. I would welcome another set of eyes
>>> on the code, and a cross check on my conclusions so far about how
>>> FastList is supposed to work. There seems to be a critical invariant
>>> that gets broken, and once that happens we are on track to either a
>>> NullPointerException or dropped items.
>>>
>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main
>>> program (JDK 1.4 or later0. In both forms, all it does is fire up a
>>> mixture of threads that repeatedly add items to a FastList and threads
>>> that repeatedly remove the first item they can from the FastList.
>>> Failures seem to require simultaneous adds and removes.
>>>
>>> If I don't nail this problem fairly soon, I may abandon the current,
>>> rather complicated, code and switch to writing a concurrent high
>>> performance FastList substitute for 1.5 or later.
>>>
>>> Patricia
>>>
>> I'll have a look tonight, no promises though ;)
>
> I'm attaching the simplified test application main program that can run, and
> fail, under JRE 1.4, with no need for Junit.
>
> Here's what I think I know. First of all, I have found some dubious
> synchronization situations. However, fixing all the things I have so far found
> of that type has only reduced the failure rate, not eliminated failures. That
> could be caused by changing timings without having any impact on the root cause.
>
> The key invariant relates to a thread that is doing a scan, starting with a call
> to head() and proceeding through a series of calls to next() to examine nodes.
> The head() call sets up a guard node for the thread that was the tail at some
> point during the head call. The invariant is that the series of next() calls
> will reach the guard node before finding a null next pointer, indicating the
> actual tail.
>
> The remove call does not really remove anything, it merely marks the node
> removed. Removed nodes are unlinked as a side effect of scans, during head and
> next calls, but only if they are not guard nodes.
>
> There are additional complications in the restart and reap methods, but we can
> ignore them for now - my test does not use them.
>
> Once a guard node is lost, the synchronization breaks down completely, because
> e.g. insertion at tail is protected by synchronization on the FastList instance,
> but unlinking of a removed node in the middle is protected, to the extent it is
> protected at all, by synchronization on the FastList.Node instance that is being
> removed.
>
> The commonest failure symptom is a scan reaching the null next pointer at the
> end of the FIFO during head(), without first finding the guard node it just set
> up. An alternative form of failure is loss of some entries - they get added, but
> the remove threads never find them. The second symptom is predominant in the
> JavaSpaces stress test that got me started on this. Messing up a next pointer
> could cause either.
>
> Incidentally, I'm curious about why it has such a fragile system in which the
> state of a scan is partly tracked by thread, when it seems like an obvious
> candidate for the Iterator pattern. Callers do need to be able to find out if a
> remove call succeeded or not (the node may have been removed by another thread),
> but that could be done in an interface extending Iterator. The WeakHashMap in a
> node that keeps track of the threads for which it is a guard would instead track
> the Iterator. There would be no need for thread local storage, the same data
> could be kept in the Iterator.
>
> Thanks for any time you can spend looking at this.
>
> Patricia
>
>

Re: Progress, and a problem

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/12/2010 5:48 PM, Peter Firmstone wrote:
> Patricia Shanahan wrote:
>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
>> ...
>> > The important issue in FastList is that it was written with the JDK1.4
>> > memory model. After moving River to Java 1.5, we'd have the JSR166 work
>> > and the new, consistent memory model where volatile has a true meaning.
>> > However, this code in particular is quite complex as you have noted, so
>> > even adjusting to the new memory model could be problematic.
>>
>> I've just run a modified, simplified version of my test with java
>> version "1.4.2_19" and an unmodified copy of FastList, and I still get
>> the NullPointerException. This changes my thinking a bit. I had been
>> working from the assumption that the issue was to do with the changes
>> in memory model between 1.4 and 1.5. I now have to consider the
>> possibility of a more basic bug that is independent of the memory model.
>>
>> If there is anyone with a FastList or Java memory model background who
>> would like to help, please reply. I would welcome another set of eyes
>> on the code, and a cross check on my conclusions so far about how
>> FastList is supposed to work. There seems to be a critical invariant
>> that gets broken, and once that happens we are on track to either a
>> NullPointerException or dropped items.
>>
>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main
>> program (JDK 1.4 or later0. In both forms, all it does is fire up a
>> mixture of threads that repeatedly add items to a FastList and threads
>> that repeatedly remove the first item they can from the FastList.
>> Failures seem to require simultaneous adds and removes.
>>
>> If I don't nail this problem fairly soon, I may abandon the current,
>> rather complicated, code and switch to writing a concurrent high
>> performance FastList substitute for 1.5 or later.
>>
>> Patricia
>>
> I'll have a look tonight, no promises though ;)

I'm attaching the simplified test application main program that can run, 
and fail, under JRE 1.4, with no need for Junit.

Here's what I think I know. First of all, I have found some dubious 
synchronization situations. However, fixing all the things I have so far 
found of that type has only reduced the failure rate, not eliminated 
failures. That could be caused by changing timings without having any 
impact on the root cause.

The key invariant relates to a thread that is doing a scan, starting 
with a call to head() and proceeding through a series of calls to next() 
to examine nodes. The head() call sets up a guard node for the thread 
that was the tail at some point during the head call. The invariant is 
that the series of next() calls will reach the guard node before finding 
a null next pointer, indicating the actual tail.

The remove call does not really remove anything, it merely marks the 
node removed. Removed nodes are unlinked as a side effect of scans, 
during head and next calls, but only if they are not guard nodes.

There are additional complications in the restart and reap methods, but 
we can ignore them for now - my test does not use them.

Once a guard node is lost, the synchronization breaks down completely, 
because e.g. insertion at tail is protected by synchronization on the 
FastList instance, but unlinking of a removed node in the middle is 
protected, to the extent it is protected at all, by synchronization on 
the FastList.Node instance that is being removed.

The commonest failure symptom is a scan reaching the null next pointer 
at the end of the FIFO during head(), without first finding the guard 
node it just set up. An alternative form of failure is loss of some 
entries - they get added, but the remove threads never find them. The 
second symptom is predominant in the JavaSpaces stress test that got me 
started on this. Messing up a next pointer could cause either.

Incidentally, I'm curious about why it has such a fragile system in 
which the state of a scan is partly tracked by thread, when it seems 
like an obvious candidate for the Iterator pattern. Callers do need to 
be able to find out if a remove call succeeded or not (the node may have 
been removed by another thread), but that could be done in an interface 
extending Iterator. The WeakHashMap in a node that keeps track of the 
threads for which it is a guard would instead track the Iterator. There 
would be no need for thread local storage, the same data could be kept 
in the Iterator.

Thanks for any time you can spend looking at this.

Patricia

Re: Progress, and a problem

Posted by Peter Firmstone <ji...@zeus.net.au>.

Patricia Shanahan wrote:
> On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
> ...
> > The important issue in FastList is that it was written with the JDK1.4
> > memory model. After moving River to Java 1.5, we'd have the JSR166 work
> > and the new, consistent memory model where volatile has a true meaning.
> > However, this code in particular is quite complex as you have noted, so
> > even adjusting to the new memory model could be problematic.
>
> I've just run a modified, simplified version of my test with java 
> version "1.4.2_19" and an unmodified copy of FastList, and I still get 
> the NullPointerException. This changes my thinking a bit. I had been 
> working from the assumption that the issue was to do with the changes 
> in memory model between 1.4 and 1.5. I now have to consider the 
> possibility of a more basic bug that is independent of the memory model.
>
> If there is anyone with a FastList or Java memory model background who 
> would like to help, please reply. I would welcome another set of eyes 
> on the code, and a cross check on my conclusions so far about how 
> FastList is supposed to work. There seems to be a critical invariant 
> that gets broken, and once that happens we are on track to either a 
> NullPointerException or dropped items.
>
> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main 
> program (JDK 1.4 or later0. In both forms, all it does is fire up a 
> mixture of threads that repeatedly add items to a FastList and threads 
> that repeatedly remove the first item they can from the FastList. 
> Failures seem to require simultaneous adds and removes.
>
> If I don't nail this problem fairly soon, I may abandon the current, 
> rather complicated, code and switch to writing a concurrent high 
> performance FastList substitute for 1.5 or later.
>
> Patricia
>
I'll have a look tonight, no promises though ;)

Cheers,

Peter.

Re: Progress, and a problem

Posted by Patricia Shanahan <pa...@acm.org>.

On 12/3/2010 7:15 AM, Gregg Wonderly wrote:
...
 > The important issue in FastList is that it was written with the JDK1.4
 > memory model. After moving River to Java 1.5, we'd have the JSR166 work
 > and the new, consistent memory model where volatile has a true meaning.
 > However, this code in particular is quite complex as you have noted, so
 > even adjusting to the new memory model could be problematic.

I've just run a modified, simplified version of my test with java 
version "1.4.2_19" and an unmodified copy of FastList, and I still get 
the NullPointerException. This changes my thinking a bit. I had been 
working from the assumption that the issue was to do with the changes in 
memory model between 1.4 and 1.5. I now have to consider the possibility 
of a more basic bug that is independent of the memory model.

If there is anyone with a FastList or Java memory model background who 
would like to help, please reply. I would welcome another set of eyes on 
the code, and a cross check on my conclusions so far about how FastList 
is supposed to work. There seems to be a critical invariant that gets 
broken, and once that happens we are on track to either a 
NullPointerException or dropped items.

I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main 
program (JDK 1.4 or later0. In both forms, all it does is fire up a 
mixture of threads that repeatedly add items to a FastList and threads 
that repeatedly remove the first item they can from the FastList. 
Failures seem to require simultaneous adds and removes.

If I don't nail this problem fairly soon, I may abandon the current, 
rather complicated, code and switch to writing a concurrent high 
performance FastList substitute for 1.5 or later.

Patricia

Re: Progress, and a problem

Posted by Patricia Shanahan <pa...@acm.org>.

The first conclusion I've reached is that outrigger does have users, and 
so learning enough about it to fix a possible concurrency bug would be 
useful. I want to begin by making the current functionality work, even 
under unrealistic stress. I would also like to see if maintainability 
can be improved with adversely affecting performance.

I have one wish-list item of my own - an outrigger benchmark. As I make 
changes, I need some way to detect and guard against performance 
regression, and possibly find performance gains. As I've already said in 
this mailing list, I'm a non-believer in performance tuning without 
measurement.

We can certainly start a discussion on JavaSpace enhancements. I don't 
promise to be able to implement them, but I'm willing to talk.

Patricia


James Grahn wrote:
> My company also uses Outrigger semi-frequently.
> 
> I didn't speak up when Patricia was asking about the implementation, 
> because I've not dug into the actual code that much.
> 
> That said, most of my aspirations for River revolve around improving 
> Javaspaces.   (Unfortunately with a few compatibility breaks from the 
> existing version.)
> 
> I'll try to pass along my wishlist soon, just so it's on the record even 
> if I don't get around to contributing any code.   If nothing else, it 
> should be useful for discussion.
> 
> jamesG
> 
> On 12/3/2010 2:41 PM, Dennis Reedy wrote:
>> Outrigger is a key service in a project that I am involved in. I see 
>> no reason to remove it from the distribution, it provides a solid 
>> implementation of the JavaSpace and JavaSpace05 interfaces. There are 
>> other services I would vote off the island ;) but thats a topic for a 
>> different thread.
>>
>> Dennis
>>
>> On Dec 3, 2010, at 130PM, Tom Hobbs wrote:
>>
>>>> With River's recent revitalization let's not deprecate anything too
>>>> hastily.
>>>
>>> +1.  You never know, this may yet convince someone who knows about
>>> Outrigger to come back!
>>>
>>> On Fri, Dec 3, 2010 at 6:21 PM, Jeff 
>>> Ramsdale<je...@gmail.com>  wrote:
>>>> On Fri, Dec 3, 2010 at 7:15 AM, Gregg Wonderly<gr...@wonderly.org>  
>>>> wrote:
>>>>
>>>>> Many people are using Dan Creswell's Blitz JavaSpaces 
>>>>> implementation or
>>>>> commercial versions.  I'm partially inclined to suggest that we should
>>>>> discuss EOL of outrigger at some point. Even though Javaspaces is a 
>>>>> large
>>>>> part of what Jini has been recognized for, it has a focused 
>>>>> audience and if
>>>>> we don't have someone with knowledge and interest to support 
>>>>> outrigger, it
>>>>> may be more of a wart than River can deal with.
>>>>
>>>> With River's recent revitalization let's not deprecate anything too
>>>> hastily. Conventional wisdom in Java app development has struggled to
>>>> hold its own against NoSQL and other interesting technologies. It
>>>> could be the perfect opportunity for Jini/JavaSpaces to gain new
>>>> interest.
>>>>
>>>> -jeff
>>>>
>>
>>
>

Re: Progress, and a problem

Posted by James Grahn <jg...@simulexinc.com>.

My company also uses Outrigger semi-frequently.

I didn't speak up when Patricia was asking about the implementation, 
because I've not dug into the actual code that much.

That said, most of my aspirations for River revolve around improving 
Javaspaces.   (Unfortunately with a few compatibility breaks from the 
existing version.)

I'll try to pass along my wishlist soon, just so it's on the record even 
if I don't get around to contributing any code.   If nothing else, it 
should be useful for discussion.

jamesG

On 12/3/2010 2:41 PM, Dennis Reedy wrote:
> Outrigger is a key service in a project that I am involved in. I see no reason to remove it from the distribution, it provides a solid implementation of the JavaSpace and JavaSpace05 interfaces. There are other services I would vote off the island ;) but thats a topic for a different thread.
>
> Dennis
>
> On Dec 3, 2010, at 130PM, Tom Hobbs wrote:
>
>>> With River's recent revitalization let's not deprecate anything too
>>> hastily.
>>
>> +1.  You never know, this may yet convince someone who knows about
>> Outrigger to come back!
>>
>> On Fri, Dec 3, 2010 at 6:21 PM, Jeff Ramsdale<je...@gmail.com>  wrote:
>>> On Fri, Dec 3, 2010 at 7:15 AM, Gregg Wonderly<gr...@wonderly.org>  wrote:
>>>
>>>> Many people are using Dan Creswell's Blitz JavaSpaces implementation or
>>>> commercial versions.  I'm partially inclined to suggest that we should
>>>> discuss EOL of outrigger at some point. Even though Javaspaces is a large
>>>> part of what Jini has been recognized for, it has a focused audience and if
>>>> we don't have someone with knowledge and interest to support outrigger, it
>>>> may be more of a wart than River can deal with.
>>>
>>> With River's recent revitalization let's not deprecate anything too
>>> hastily. Conventional wisdom in Java app development has struggled to
>>> hold its own against NoSQL and other interesting technologies. It
>>> could be the perfect opportunity for Jini/JavaSpaces to gain new
>>> interest.
>>>
>>> -jeff
>>>
>
>

Re: Progress, and a problem

Posted by Dennis Reedy <de...@gmail.com>.

Outrigger is a key service in a project that I am involved in. I see no reason to remove it from the distribution, it provides a solid implementation of the JavaSpace and JavaSpace05 interfaces. There are other services I would vote off the island ;) but thats a topic for a different thread.

Dennis

On Dec 3, 2010, at 130PM, Tom Hobbs wrote:

>> With River's recent revitalization let's not deprecate anything too
>> hastily.
> 
> +1.  You never know, this may yet convince someone who knows about
> Outrigger to come back!
> 
> On Fri, Dec 3, 2010 at 6:21 PM, Jeff Ramsdale <je...@gmail.com> wrote:
>> On Fri, Dec 3, 2010 at 7:15 AM, Gregg Wonderly <gr...@wonderly.org> wrote:
>> 
>>> Many people are using Dan Creswell's Blitz JavaSpaces implementation or
>>> commercial versions.  I'm partially inclined to suggest that we should
>>> discuss EOL of outrigger at some point. Even though Javaspaces is a large
>>> part of what Jini has been recognized for, it has a focused audience and if
>>> we don't have someone with knowledge and interest to support outrigger, it
>>> may be more of a wart than River can deal with.
>> 
>> With River's recent revitalization let's not deprecate anything too
>> hastily. Conventional wisdom in Java app development has struggled to
>> hold its own against NoSQL and other interesting technologies. It
>> could be the perfect opportunity for Jini/JavaSpaces to gain new
>> interest.
>> 
>> -jeff
>>

Re: Progress, and a problem

Posted by Patricia Shanahan <pa...@acm.org>.

I've thought about it a bit, and strongly suggest first looking to see 
if there is a reasonably good existing benchmark for JavaSpaces. If 
there is, we should certainly consider it.

Otherwise, I suggest making the benchmark deal with two types of data, 
"reference" and "transaction". Reference data is written rarely, remains 
in the space for a long time, and is frequently read. Groups of 
transaction entries are written by one transaction, and removed when it 
completes. They have short lifetimes.

The benchmark should have parameters that let us shift proportions of 
different entry behavior, numbers of threads, numbers of transactions etc.

I've written a relatively simple unit test for 
com.sun.jini.outrigger.FastList that sometimes gets a null pointer 
exception in it. FastList is the lowest level class involved in storing 
outrigger's entries. As far as I can tell, my use of FastList conforms 
to its interface, and the problem is internal to it.

Patricia

On 12/4/2010 5:50 PM, MICHAEL MCGRADY wrote:
> I think you are more objective than that and would be interested in your suggestion(s).
>
> MG
>
> On Dec 4, 2010, at 7:02 AM, Patricia Shanahan wrote:
>
>> Also, it may be worth searching for JavaSpace benchmarks. I don't want to write or pick the benchmark myself, because it should reflect a users-eye-view of JavaSpaces. I might accidentally bias the benchmark to reflect my current thinking on performance improvement.
>>
>> Patricia
>>
>>
>> On 12/4/2010 6:24 AM, Mike McGrady wrote:
>>> Yes - there is.  Let me noodle this.  i will have to make adjustments, factor in time, etc.
>>>
>>> Sent from my iPhone
>>>
>>> Michael McGrady
>>> Principal investigator AF081_028 SBIR
>>> Chief Architect
>>> Topia Technology, Inc
>>> Work 1.253.572.9712
>>> Cel 1.253.720.3365
>>>
>>> On Dec 3, 2010, at 9:34 PM, Patricia Shanahan<pa...@acm.org>   wrote:
>>>
>>>> MICHAEL MCGRADY wrote:
>>>>> I might jump in here too as Patricia's go to guy or right hand man,
>>>>> or something.  I am interested if Patricia is.  We would have to talk
>>>>> but I think that a River implementation of JavaSpaces is important.
>>>>
>>>> Is there any chance you could write a JavaSpaces benchmark? Perhaps work with other pro-JavaSpaces people on the developer and user mailing lists to make it realistic.
>>>>
>>>> I'm worried that anything I do to fix the bug I'm hunting might slow it down. In the long term, I want to make it faster.
>>>>
>>>> As my next step, I'm planning to try to understand, and document, the principles of operation of the current outrigger implementation.
>>>>
>>>> Patricia
>>>
>>
>
> Michael McGrady
> Chief Architect
> Topia Technology, Inc.
> Cel 1.253.720.3365
> Work 1.253.572.9712 extension 2037
> mmcgrady@topiatechnology.com
>
>
>
>
>
>

Re: Progress, and a problem

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

I think you are more objective than that and would be interested in your suggestion(s).

MG

On Dec 4, 2010, at 7:02 AM, Patricia Shanahan wrote:

> Also, it may be worth searching for JavaSpace benchmarks. I don't want to write or pick the benchmark myself, because it should reflect a users-eye-view of JavaSpaces. I might accidentally bias the benchmark to reflect my current thinking on performance improvement.
> 
> Patricia
> 
> 
> On 12/4/2010 6:24 AM, Mike McGrady wrote:
>> Yes - there is.  Let me noodle this.  i will have to make adjustments, factor in time, etc.
>> 
>> Sent from my iPhone
>> 
>> Michael McGrady
>> Principal investigator AF081_028 SBIR
>> Chief Architect
>> Topia Technology, Inc
>> Work 1.253.572.9712
>> Cel 1.253.720.3365
>> 
>> On Dec 3, 2010, at 9:34 PM, Patricia Shanahan<pa...@acm.org>  wrote:
>> 
>>> MICHAEL MCGRADY wrote:
>>>> I might jump in here too as Patricia's go to guy or right hand man,
>>>> or something.  I am interested if Patricia is.  We would have to talk
>>>> but I think that a River implementation of JavaSpaces is important.
>>> 
>>> Is there any chance you could write a JavaSpaces benchmark? Perhaps work with other pro-JavaSpaces people on the developer and user mailing lists to make it realistic.
>>> 
>>> I'm worried that anything I do to fix the bug I'm hunting might slow it down. In the long term, I want to make it faster.
>>> 
>>> As my next step, I'm planning to try to understand, and document, the principles of operation of the current outrigger implementation.
>>> 
>>> Patricia
>> 
> 

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: Progress, and a problem

Posted by Patricia Shanahan <pa...@acm.org>.

Also, it may be worth searching for JavaSpace benchmarks. I don't want 
to write or pick the benchmark myself, because it should reflect a 
users-eye-view of JavaSpaces. I might accidentally bias the benchmark to 
reflect my current thinking on performance improvement.

Patricia


On 12/4/2010 6:24 AM, Mike McGrady wrote:
> Yes - there is.  Let me noodle this.  i will have to make adjustments, factor in time, etc.
>
> Sent from my iPhone
>
> Michael McGrady
> Principal investigator AF081_028 SBIR
> Chief Architect
> Topia Technology, Inc
> Work 1.253.572.9712
> Cel 1.253.720.3365
>
> On Dec 3, 2010, at 9:34 PM, Patricia Shanahan<pa...@acm.org>  wrote:
>
>> MICHAEL MCGRADY wrote:
>>> I might jump in here too as Patricia's go to guy or right hand man,
>>> or something.  I am interested if Patricia is.  We would have to talk
>>> but I think that a River implementation of JavaSpaces is important.
>>
>> Is there any chance you could write a JavaSpaces benchmark? Perhaps work with other pro-JavaSpaces people on the developer and user mailing lists to make it realistic.
>>
>> I'm worried that anything I do to fix the bug I'm hunting might slow it down. In the long term, I want to make it faster.
>>
>> As my next step, I'm planning to try to understand, and document, the principles of operation of the current outrigger implementation.
>>
>> Patricia
>

Re: Progress, and a problem

Posted by Mike McGrady <mm...@topiatechnology.com>.

Yes - there is.  Let me noodle this.  i will have to make adjustments, factor in time, etc.  

Sent from my iPhone

Michael McGrady
Principal investigator AF081_028 SBIR
Chief Architect
Topia Technology, Inc
Work 1.253.572.9712
Cel 1.253.720.3365

On Dec 3, 2010, at 9:34 PM, Patricia Shanahan <pa...@acm.org> wrote:

> MICHAEL MCGRADY wrote:
>> I might jump in here too as Patricia's go to guy or right hand man,
>> or something.  I am interested if Patricia is.  We would have to talk
>> but I think that a River implementation of JavaSpaces is important.
> 
> Is there any chance you could write a JavaSpaces benchmark? Perhaps work with other pro-JavaSpaces people on the developer and user mailing lists to make it realistic.
> 
> I'm worried that anything I do to fix the bug I'm hunting might slow it down. In the long term, I want to make it faster.
> 
> As my next step, I'm planning to try to understand, and document, the principles of operation of the current outrigger implementation.
> 
> Patricia

Re: Progress, and a problem

Posted by Patricia Shanahan <pa...@acm.org>.

MICHAEL MCGRADY wrote:
> I might jump in here too as Patricia's go to guy or right hand man,
> or something.  I am interested if Patricia is.  We would have to talk
> but I think that a River implementation of JavaSpaces is important.

Is there any chance you could write a JavaSpaces benchmark? Perhaps work 
with other pro-JavaSpaces people on the developer and user mailing lists 
to make it realistic.

I'm worried that anything I do to fix the bug I'm hunting might slow it 
down. In the long term, I want to make it faster.

As my next step, I'm planning to try to understand, and document, the 
principles of operation of the current outrigger implementation.

Patricia

Re: Progress, and a problem

Posted by MICHAEL MCGRADY <mm...@topiatechnology.com>.

I might jump in here too as Patricia's go to guy or right hand man, or something.  I am interested if Patricia is.  We would have to talk but I think that a River implementation of JavaSpaces is important.

MG


On Dec 3, 2010, at 10:30 AM, Tom Hobbs wrote:

>> With River's recent revitalization let's not deprecate anything too
>> hastily.
> 
> +1.  You never know, this may yet convince someone who knows about
> Outrigger to come back!
> 
> On Fri, Dec 3, 2010 at 6:21 PM, Jeff Ramsdale <je...@gmail.com> wrote:
>> On Fri, Dec 3, 2010 at 7:15 AM, Gregg Wonderly <gr...@wonderly.org> wrote:
>> 
>>> Many people are using Dan Creswell's Blitz JavaSpaces implementation or
>>> commercial versions.  I'm partially inclined to suggest that we should
>>> discuss EOL of outrigger at some point. Even though Javaspaces is a large
>>> part of what Jini has been recognized for, it has a focused audience and if
>>> we don't have someone with knowledge and interest to support outrigger, it
>>> may be more of a wart than River can deal with.
>> 
>> With River's recent revitalization let's not deprecate anything too
>> hastily. Conventional wisdom in Java app development has struggled to
>> hold its own against NoSQL and other interesting technologies. It
>> could be the perfect opportunity for Jini/JavaSpaces to gain new
>> interest.
>> 
>> -jeff
>> 

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgrady@topiatechnology.com

Re: Progress, and a problem

Posted by Tom Hobbs <tv...@googlemail.com>.

> With River's recent revitalization let's not deprecate anything too
> hastily.

+1.  You never know, this may yet convince someone who knows about
Outrigger to come back!

On Fri, Dec 3, 2010 at 6:21 PM, Jeff Ramsdale <je...@gmail.com> wrote:
> On Fri, Dec 3, 2010 at 7:15 AM, Gregg Wonderly <gr...@wonderly.org> wrote:
>
>> Many people are using Dan Creswell's Blitz JavaSpaces implementation or
>> commercial versions.  I'm partially inclined to suggest that we should
>> discuss EOL of outrigger at some point. Even though Javaspaces is a large
>> part of what Jini has been recognized for, it has a focused audience and if
>> we don't have someone with knowledge and interest to support outrigger, it
>> may be more of a wart than River can deal with.
>
> With River's recent revitalization let's not deprecate anything too
> hastily. Conventional wisdom in Java app development has struggled to
> hold its own against NoSQL and other interesting technologies. It
> could be the perfect opportunity for Jini/JavaSpaces to gain new
> interest.
>
> -jeff
>

Re: Progress, and a problem

Posted by Jeff Ramsdale <je...@gmail.com>.

On Fri, Dec 3, 2010 at 7:15 AM, Gregg Wonderly <gr...@wonderly.org> wrote:

> Many people are using Dan Creswell's Blitz JavaSpaces implementation or
> commercial versions.  I'm partially inclined to suggest that we should
> discuss EOL of outrigger at some point. Even though Javaspaces is a large
> part of what Jini has been recognized for, it has a focused audience and if
> we don't have someone with knowledge and interest to support outrigger, it
> may be more of a wart than River can deal with.

With River's recent revitalization let's not deprecate anything too
hastily. Conventional wisdom in Java app development has struggled to
hold its own against NoSQL and other interesting technologies. It
could be the perfect opportunity for Jini/JavaSpaces to gain new
interest.

-jeff

Re: Progress, and a problem

Posted by Gregg Wonderly <gr...@wonderly.org>.

On 12/3/2010 1:27 AM, Patricia Shanahan wrote:
> I'm currently hunting an intermittent bug found by the test
> qa/src/com/sun/jini/test/impl/outrigger/matching/StressTestWithShutdown.td
>
> After a failure on Hudson, I modified the .td file to make it fail more often by
> increasing the number of entries (10,000), readers (1000), and writers (1000).
>
> The writers write entries in an OutriggerServerImpl JavaSpace. The readers read,
> and then take, entries that the writers wrote. Sometimes, a reader fails to find
> an entry a writer claims to have written, causing a timeout.
>
> The outrigger implementation depends on the class FastList which seems to use
> the infamous Double Checked Locking idiom
> (http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html)
>
> The good news is that any memory model related error in FastList, or the related
> class EntryHolder, would be a plausible cause of the observed symptom. The bad
> news is that FastList and EntryHolder seem to have been written to be very
> aggressively parallel, possibly by someone who was only familiar with
> sequentially consistent memory. :-(

The important issue in FastList is that it was written with the JDK1.4 memory 
model.  After moving River to Java 1.5, we'd have the JSR166 work and the new, 
consistent memory model where volatile has a true meaning.  However, this code 
in particular is quite complex as you have noted, so even adjusting to the new 
memory model could be problematic.

Many people are using Dan Creswell's Blitz JavaSpaces implementation or 
commercial versions.  I'm partially inclined to suggest that we should discuss 
EOL of outrigger at some point. Even though Javaspaces is a large part of what 
Jini has been recognized for, it has a focused audience and if we don't have 
someone with knowledge and interest to support outrigger, it may be more of a 
wart than River can deal with.

> Usually, it is easy to fix a problem once it has been located. This may be a bit
> more difficult, especially because I assume the parallelism is needed for
> acceptable JavaSpace performance.

One of the issues that I've found in network intensive applications, is that the 
latency of communications is so huge compared to code paths, that all active 
threads will fairly quickly end up hovering on top of any use of "synchronized" 
so that there is always the worst case contention for such protected resources.

It's important to understand how to deal with this by either minimizing 
synchronization time, or avoiding funneling kinds of locking mechanisms.

Gregg Wonderly

Re: Progress, and a problem

Posted by Jeff Ramsdale <je...@gmail.com>.

Shay, as Patricia has clearly stated in the past she's doing this as a
hobby, not looking for a production system. I'd rather see you get
involved in the community than try to draw people away. Free is great
(as is non-free--I have nothing against commerce), but Apache River is
open source. How will you contribute?

-jeff

On Sat, Dec 4, 2010 at 11:57 AM, Shay Hassidim <sh...@gigaspaces.com> wrote:
> Patricia,
> I suggest you to try the GigaSpaces implementation.  We solved this issue long time ago.
> We have a free community edition, so you can download it and use it in production.
> Take a look also on our exclusive locking option. You might find it useful.
> Shay
>
> ----- Original Message -----
> From: Patricia Shanahan <pa...@acm.org>
> To: river-dev@incubator.apache.org <ri...@incubator.apache.org>
> Sent: Thu Dec 02 23:27:15 2010
> Subject: Progress, and a problem
>
> I'm currently hunting an intermittent bug found by the test
> qa/src/com/sun/jini/test/impl/outrigger/matching/StressTestWithShutdown.td
>
> After a failure on Hudson, I modified the .td file to make it fail more
> often by increasing the number of entries (10,000), readers (1000), and
> writers (1000).
>
> The writers write entries in an OutriggerServerImpl JavaSpace. The
> readers read, and then take, entries that the writers wrote. Sometimes,
> a reader fails to find an entry a writer claims to have written, causing
> a timeout.
>
> The outrigger implementation depends on the class FastList which seems
> to use the infamous Double Checked Locking idiom
> (http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html)
>
> The good news is that any memory model related error in FastList, or the
> related class EntryHolder, would be a plausible cause of the observed
> symptom. The bad news is that FastList and EntryHolder seem to have been
> written to be very aggressively parallel, possibly by someone who was
> only familiar with sequentially consistent memory. :-(
>
> Usually, it is easy to fix a problem once it has been located. This may
> be a bit more difficult, especially because I assume the parallelism is
> needed for acceptable JavaSpace performance.
>
> Patricia
>
>
>

Re: Progress, and a problem

Posted by Shay Hassidim <sh...@gigaspaces.com>.

----- Original Message -----
From: Patricia Shanahan <pa...@acm.org>
To: river-dev@incubator.apache.org <ri...@incubator.apache.org>
Sent: Thu Dec 02 23:27:15 2010
Subject: Progress, and a problem

I'm currently hunting an intermittent bug found by the test 
qa/src/com/sun/jini/test/impl/outrigger/matching/StressTestWithShutdown.td

After a failure on Hudson, I modified the .td file to make it fail more 
often by increasing the number of entries (10,000), readers (1000), and 
writers (1000).

The writers write entries in an OutriggerServerImpl JavaSpace. The 
readers read, and then take, entries that the writers wrote. Sometimes, 
a reader fails to find an entry a writer claims to have written, causing 
a timeout.

The outrigger implementation depends on the class FastList which seems 
to use the infamous Double Checked Locking idiom 
(http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html)

The good news is that any memory model related error in FastList, or the 
related class EntryHolder, would be a plausible cause of the observed 
symptom. The bad news is that FastList and EntryHolder seem to have been 
written to be very aggressively parallel, possibly by someone who was 
only familiar with sequentially consistent memory. :-(

Usually, it is easy to fix a problem once it has been located. This may 
be a bit more difficult, especially because I assume the parallelism is 
needed for acceptable JavaSpace performance.

Patricia

Re: Progress, and a problem

Posted by Patricia Shanahan <pa...@acm.org>.

I took a quick look at GigaSpaces, and I don't think it meets the 
minimum requirements for a River outrigger replacement.

A satisfactory replacement must have Apache-like binary licensing for a 
version that is no more restricted than outrigger, so that any River 
user who depends on outrigger now, or might depend on it in the future, 
can use the replacement just as freely. Open source would also be very 
desirable, to enable future River-specific tuning and enhancements.

As far as I can tell, GigaSpaces is a closed source commercial product 
with a community edition that is free but restricted to a single node. 
According to http://www.gigaspaces.com/LatestProductVersion, the 
community edition "... is useful for testing, development and 
small-scale deployments.".

Thanks for the suggestion, but I don't think you understood the 
requirements.

Patricia

On 12/4/2010 11:57 AM, Shay Hassidim wrote:
> Patricia,
> I suggest you to try the GigaSpaces implementation.  We solved this issue long time ago.
> We have a free community edition, so you can download it and use it in production.
> Take a look also on our exclusive locking option. You might find it useful.
> Shay
>
> ----- Original Message -----
> From: Patricia Shanahan<pa...@acm.org>
> To: river-dev@incubator.apache.org<ri...@incubator.apache.org>
> Sent: Thu Dec 02 23:27:15 2010
> Subject: Progress, and a problem
>
> I'm currently hunting an intermittent bug found by the test
> qa/src/com/sun/jini/test/impl/outrigger/matching/StressTestWithShutdown.td
>
> After a failure on Hudson, I modified the .td file to make it fail more
> often by increasing the number of entries (10,000), readers (1000), and
> writers (1000).
>
> The writers write entries in an OutriggerServerImpl JavaSpace. The
> readers read, and then take, entries that the writers wrote. Sometimes,
> a reader fails to find an entry a writer claims to have written, causing
> a timeout.
>
> The outrigger implementation depends on the class FastList which seems
> to use the infamous Double Checked Locking idiom
> (http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html)
>
> The good news is that any memory model related error in FastList, or the
> related class EntryHolder, would be a plausible cause of the observed
> symptom. The bad news is that FastList and EntryHolder seem to have been
> written to be very aggressively parallel, possibly by someone who was
> only familiar with sequentially consistent memory. :-(
>
> Usually, it is easy to fix a problem once it has been located. This may
> be a bit more difficult, especially because I assume the parallelism is
> needed for acceptable JavaSpace performance.
>
> Patricia
>
>
>

Re: Progress, and a problem

Posted by Shay Hassidim <sh...@gigaspaces.com>.

Patricia,
I suggest you to try the GigaSpaces implementation.  We solved this issue long time ago.
We have a free community edition, so you can download it and use it in production.
Take a look also on our exclusive locking option. You might find it useful.
Shay 

----- Original Message -----
From: Patricia Shanahan <pa...@acm.org>
To: river-dev@incubator.apache.org <ri...@incubator.apache.org>
Sent: Thu Dec 02 23:27:15 2010
Subject: Progress, and a problem

I'm currently hunting an intermittent bug found by the test 
qa/src/com/sun/jini/test/impl/outrigger/matching/StressTestWithShutdown.td

After a failure on Hudson, I modified the .td file to make it fail more 
often by increasing the number of entries (10,000), readers (1000), and 
writers (1000).

The writers write entries in an OutriggerServerImpl JavaSpace. The 
readers read, and then take, entries that the writers wrote. Sometimes, 
a reader fails to find an entry a writer claims to have written, causing 
a timeout.

The outrigger implementation depends on the class FastList which seems 
to use the infamous Double Checked Locking idiom 
(http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html)

The good news is that any memory model related error in FastList, or the 
related class EntryHolder, would be a plausible cause of the observed 
symptom. The bad news is that FastList and EntryHolder seem to have been 
written to be very aggressively parallel, possibly by someone who was 
only familiar with sequentially consistent memory. :-(

Usually, it is easy to fix a problem once it has been located. This may 
be a bit more difficult, especially because I assume the parallelism is 
needed for acceptable JavaSpace performance.

Patricia