You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cayenne.apache.org by Øyvind Harboe <oy...@zylin.com> on 2008/06/20 15:46:06 UTC

weakly referenced paged queries

I'm toying around with weak references in paged queries.

This turned out to be rather hairy from the point of view that
a query is supposed to return a List and this List has
to live up to a lot of expectations from the calling code +
promises of the List interfaces.



Consider two of the subList implementations in IncrementalFaultList:

a)

    public List<E> subList(int fromIndex, int toIndex) {
        synchronized (elements) {
            resolveInterval(fromIndex, toIndex);
            return elements.subList(fromIndex, toIndex);
        }
    }

b)

    public List<E> subList(int fromIndex, int toIndex) {
    	List<E> l=new LinkedList<E>();
    	for (int i=fromIndex; i<toIndex; i++)
    	{
    		l.add(get(i));
    	}
    	return l;
    }


1. JavaDoc states that list.subList(from, to).clear() will clear that
range.  Here b) is broken.

2. I was greatly surprised to see synchronized() for the list returned
from the query. Is the synchronisation for the benefit of the
calling code or the implementation?


Attached is the work in progress. First and foremost I'm after measuring
the performance of such an implementation. So far it doesn't look very
promising. It does use less memory but it is dog slow. Will tinker with it more.


-- 
Øyvind Harboe
http://www.zylin.com/zy1000.html
ARM7 ARM9 XScale Cortex
JTAG debugger and flash programmer

Re: weakly referenced paged queries

Posted by Lachlan Deck <la...@gmail.com>.
On 23/06/2008, at 4:43 PM, Øyvind Harboe wrote:

> On Sun, Jun 22, 2008 at 10:28 PM, Andrus Adamchik
> <an...@objectstyle.org> wrote:
>>
>> On Jun 20, 2008, at 10:41 PM, Øyvind Harboe wrote:
>>
>>> - the JavaDoc explained *why* the synchronization is there and  
>>> what it
>>> is supposed to do and how the client is supposed to use it.
>>
>> From the class JavaDocs: "A synchronized list that serves as a  
>> container of
>> DataObjects"... A Collection is either synchronized or not. No other
>> explanation is needed.
>
> That's what I thought, until I saw this construct:
>
> someList.subList(a,b).clear();
>
> a) what does the above do?

@see java.util.AbstractList.sublist(int,int)

with regards,
--

Lachlan Deck




Re: weakly referenced paged queries

Posted by Øyvind Harboe <oy...@zylin.com>.
On Sun, Jun 22, 2008 at 10:28 PM, Andrus Adamchik
<an...@objectstyle.org> wrote:
>
> On Jun 20, 2008, at 10:41 PM, Øyvind Harboe wrote:
>
>> - the JavaDoc explained *why* the synchronization is there and what it
>> is supposed to do and how the client is supposed to use it.
>
> From the class JavaDocs: "A synchronized list that serves as a container of
> DataObjects"... A Collection is either synchronized or not. No other
> explanation is needed.

That's what I thought, until I saw this construct:

someList.subList(a,b).clear();

a) what does the above do?

b) is it broken if it requires synchronisation?

It is possible to implement a list such that the above does not require
the caller to be aware whether or not synchronisation is required.

> Now regarding the synchronization scope, my current POV (that is somewhat
> different from my past opinion) is that the list methods should NOT be
> synchronized by default

Glad to hear this. :-)

> (i.e. users should create synchronized wrappers if
> they need to). This way IncrementalFaultList will be aligned with standard
> Java collections. One thing that still must be synchronized internally is
> the DB operations to avoid multiple concurrent fetches of the same data.
> This does not require synchronization of most normal list access methods.

Is the synchronisation mechanism for IncrementalFaultList required for this
purpose?

I.e. does the *implementation* require it even if the caller does not? I didn't
find any evidence of that.

> I.e. the scope of synchronization should be significantly reduced. (We can
> do something extra fancy with syncing DB ops per page to allow parallel page
> faulting, but that's certainly not a very high priority task).
>
> Now your mention of performance issues... My guess without looking at the
> code, this is unlikely to happen due to oversynchronization. Compared to
> other things that are going on, synchronization overhead should be
> negligible. Of course running in profiler should answer that for sure.

The performance problem would arise if multiple CPUs where blocking on access
to the same list.  Obviously multiple CPUs would have to work on *different*
data not to require synchronisation. At that point the synchronisation
mechanism no longer serves any purpose.

My point is simply that the synchronisation code in IncrementalFaultList
does not help the user nor does it make the code any clearer + it is potentially
broken anyway.

Is it tested?

Code that isn't tested, doesn't work. Murphys law.

Code that is tested, doesn't necessarily work either. Murphys law.

The only difference between the two is that code that isn't tested, is
guaranteed not to work. If you spend the time to write the code that
tests it, you've wasted the time because it then works. Murphys law.

The only way you can win is if you delete the code. :-)

> Finally thanks for opening CAY-1075. One request - could you submit any code
> as patches instead of full files - it will make it much easier to analyze
> the changes.

Once I have something useful, I'll create the patch. For now I use Jira as
a file store. The way I work is simply to copy & paste the Cayenne .java
file into my app and fetch the rest from Cayenne. I don't actually compile
Cayenne when I hack it.


-- 
Øyvind Harboe
http://www.zylin.com/zy1000.html
ARM7 ARM9 XScale Cortex
JTAG debugger and flash programmer

Re: weakly referenced paged queries

Posted by Andrus Adamchik <an...@objectstyle.org>.
On Jun 20, 2008, at 10:41 PM, Øyvind Harboe wrote:

> - the JavaDoc explained *why* the synchronization is there and what it
> is supposed to do and how the client is supposed to use it.

 From the class JavaDocs: "A synchronized list that serves as a  
container of DataObjects"... A Collection is either synchronized or  
not. No other explanation is needed.

Now regarding the synchronization scope, my current POV (that is  
somewhat different from my past opinion) is that the list methods  
should NOT be synchronized by default (i.e. users should create  
synchronized wrappers if they need to). This way IncrementalFaultList  
will be aligned with standard Java collections. One thing that still  
must be synchronized internally is the DB operations to avoid multiple  
concurrent fetches of the same data. This does not require  
synchronization of most normal list access methods. I.e. the scope of  
synchronization should be significantly reduced. (We can do something  
extra fancy with syncing DB ops per page to allow parallel page  
faulting, but that's certainly not a very high priority task).

Now your mention of performance issues... My guess without looking at  
the code, this is unlikely to happen due to oversynchronization.  
Compared to other things that are going on, synchronization overhead  
should be negligible. Of course running in profiler should answer that  
for sure.

Finally thanks for opening CAY-1075. One request - could you submit  
any code as patches instead of full files - it will make it much  
easier to analyze the changes.

Andrus




Re: weakly referenced paged queries

Posted by Øyvind Harboe <oy...@zylin.com>.
On Fri, Jun 20, 2008 at 11:57 PM, Scott Anderson <sa...@airvana.com> wrote:
> Why do you expect there to be a comment justifying it? It's necessary to lock all
>  collections while iterating over or modifying them if they can be vulnerable to concurrent
> modification.

>From the clients point of view there is only a single thread.

What other threads are there?

Also, the type of locking used must be documented. According to the
collections framework, it won't "just work" without explicitly synchronizing.




-- 
Øyvind Harboe
http://www.zylin.com/zy1000.html
ARM7 ARM9 XScale Cortex
JTAG debugger and flash programmer

RE: weakly referenced paged queries

Posted by Scott Anderson <sa...@airvana.com>.
Why do you expect there to be a comment justifying it? It's necessary to lock all collections while iterating over or modifying them if they can be vulnerable to concurrent modification.

-----Original Message-----
From: oyvindharboe@gmail.com [mailto:oyvindharboe@gmail.com] On Behalf Of Øyvind Harboe
Sent: Friday, June 20, 2008 3:42 PM
To: user@cayenne.apache.org
Subject: Re: weakly referenced paged queries

On Fri, Jun 20, 2008 at 7:34 PM, Scott Anderson <sa...@airvana.com> wrote:
> It looks like the synchronized block is designed to prevent concurrent modification
>  of the list. I don't believe that particular code would suffer the consequences
> if concurrent modification occurred, but it's generally best to synchronize
> list access when in doubt, since even a simple iteration can be the offender.

I find that code that is added because one hasn't defined what the
interface should be, is trouble waiting to happen.

I would be much more comfortable, if either:

- the JavaDoc explained *why* the synchronization is there and what it
is supposed to do and how the client is supposed to use it.
- it serves some internal purpose and that purpose was stated as a comment
in the code
- the synchronization code was deleted(faster + less chance of deadlock).

-- 
Øyvind Harboe
http://www.zylin.com/zy1000.html
ARM7 ARM9 XScale Cortex
JTAG debugger and flash programmer

Re: weakly referenced paged queries

Posted by Øyvind Harboe <oy...@zylin.com>.
On Fri, Jun 20, 2008 at 7:34 PM, Scott Anderson <sa...@airvana.com> wrote:
> It looks like the synchronized block is designed to prevent concurrent modification
>  of the list. I don't believe that particular code would suffer the consequences
> if concurrent modification occurred, but it's generally best to synchronize
> list access when in doubt, since even a simple iteration can be the offender.

I find that code that is added because one hasn't defined what the
interface should be, is trouble waiting to happen.

I would be much more comfortable, if either:

- the JavaDoc explained *why* the synchronization is there and what it
is supposed to do and how the client is supposed to use it.
- it serves some internal purpose and that purpose was stated as a comment
in the code
- the synchronization code was deleted(faster + less chance of deadlock).

-- 
Øyvind Harboe
http://www.zylin.com/zy1000.html
ARM7 ARM9 XScale Cortex
JTAG debugger and flash programmer

RE: weakly referenced paged queries

Posted by Scott Anderson <sa...@airvana.com>.
It looks like the synchronized block is designed to prevent concurrent modification of the list. I don't believe that particular code would suffer the consequences if concurrent modification occurred, but it's generally best to synchronize list access when in doubt, since even a simple iteration can be the offender.

-----Original Message-----
From: oyvindharboe@gmail.com [mailto:oyvindharboe@gmail.com] On Behalf Of Øyvind Harboe
Sent: Friday, June 20, 2008 9:46 AM
To: cayenne-user@incubator.apache.org
Subject: weakly referenced paged queries

I'm toying around with weak references in paged queries.

This turned out to be rather hairy from the point of view that
a query is supposed to return a List and this List has
to live up to a lot of expectations from the calling code +
promises of the List interfaces.



Consider two of the subList implementations in IncrementalFaultList:

a)

    public List<E> subList(int fromIndex, int toIndex) {
        synchronized (elements) {
            resolveInterval(fromIndex, toIndex);
            return elements.subList(fromIndex, toIndex);
        }
    }

b)

    public List<E> subList(int fromIndex, int toIndex) {
    	List<E> l=new LinkedList<E>();
    	for (int i=fromIndex; i<toIndex; i++)
    	{
    		l.add(get(i));
    	}
    	return l;
    }


1. JavaDoc states that list.subList(from, to).clear() will clear that
range.  Here b) is broken.

2. I was greatly surprised to see synchronized() for the list returned
from the query. Is the synchronisation for the benefit of the
calling code or the implementation?


Attached is the work in progress. First and foremost I'm after measuring
the performance of such an implementation. So far it doesn't look very
promising. It does use less memory but it is dog slow. Will tinker with it more.


-- 
Øyvind Harboe
http://www.zylin.com/zy1000.html
ARM7 ARM9 XScale Cortex
JTAG debugger and flash programmer