Posted to dev@openjpa.apache.org by "Roytman, Alex" <Ro...@peacetech.com> on 2006/10/06 00:52:53 UTC

Proposal: Optimizing empty collection fetch. Meta Column in ContainerFieldMapping

Hello Abe,

I would like to present a valid use case and a very useful performance
enhancement.

The idea is that, if we know that a collection field is empty, there is
no need to fetch it.

This can improve performance dramatically when only some instances in a
large set have a non-empty collection field. Consider a very common
case: composite (tree-like) data structures. Unlike the true Composite
pattern, a typical tree structure has no special leaf class; that is,
any node of the tree can potentially have sub-nodes. When traversing
such a tree, as many as 70% of child-node fetches will yield an empty
collection, because the leaf level is obviously the largest in a tree
structure :-)

I wrote a prototype custom 1-N mapping that stores an "empty" flag
(whether the collection is empty) on commit. When the collection field
is loaded and the flag is set to true (empty), the mapping stores an
empty collection into the StateManager instead of going to the database
to fetch it.
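
To make the idea concrete, here is a minimal sketch in plain Java. It
does not use any OpenJPA API: NodeRow, Store, commit, loadChildren,
buildBinaryTree and countQueries are all hypothetical names, and the
"database" is just an in-memory map that counts how many
"fetch-sub-nodes" statements would have been issued.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EmptyFlagSketch {

    /** Stand-in for the owner row: id plus the hypothetical "children empty" flag column. */
    static final class NodeRow {
        final int id;
        boolean childrenEmpty = true;
        NodeRow(int id) { this.id = id; }
    }

    /** Stand-in for the database; counts the "fetch-sub-nodes" statements issued. */
    static final class Store {
        final Map<Integer, NodeRow> rows = new HashMap<>();
        final Map<Integer, List<Integer>> childIds = new HashMap<>();
        int childQueries = 0;

        /** "Commit": write the collection and refresh the flag on the owner row together. */
        void commit(int id, List<Integer> children) {
            NodeRow row = rows.computeIfAbsent(id, k -> new NodeRow(k));
            childIds.put(id, new ArrayList<>(children));
            row.childrenEmpty = children.isEmpty();
            for (int child : children) {
                rows.computeIfAbsent(child, k -> new NodeRow(k));
            }
        }

        /** "Load": when the flag says empty, return an empty list without querying. */
        List<Integer> loadChildren(int id, boolean useFlag) {
            NodeRow row = rows.get(id);
            if (useFlag && row != null && row.childrenEmpty) {
                return Collections.emptyList();
            }
            childQueries++; // one simulated SQL statement
            return childIds.getOrDefault(id, Collections.emptyList());
        }
    }

    /** Builds a binary tree with nodes 1..n, where node i has children 2i and 2i+1. */
    static Store buildBinaryTree(int n) {
        Store store = new Store();
        for (int i = 1; i <= n; i++) {
            List<Integer> children = new ArrayList<>();
            if (2 * i <= n) children.add(2 * i);
            if (2 * i + 1 <= n) children.add(2 * i + 1);
            store.commit(i, children);
        }
        return store;
    }

    /** Traverses the whole tree and returns the number of simulated child queries. */
    static int countQueries(Store store, int rootId, boolean useFlag) {
        int before = store.childQueries;
        Deque<Integer> pending = new ArrayDeque<>();
        pending.push(rootId);
        while (!pending.isEmpty()) {
            int id = pending.pop();
            for (int child : store.loadChildren(id, useFlag)) {
                pending.push(child);
            }
        }
        return store.childQueries - before;
    }

    public static void main(String[] args) {
        Store store = buildBinaryTree(7);
        System.out.println("without flag: " + countQueries(store, 1, false)); // 7
        System.out.println("with flag:    " + countQueries(store, 1, true));  // 3
    }
}
```

In this toy 7-node tree only the three internal nodes trigger a query
once the flag is consulted. A real mapping would have to write the flag
column in the same transaction as the collection rows; the sketch only
mimics that by updating both inside commit.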

The results were dramatic: when traversing an 800-node tree, the number
of "fetch-sub-nodes" SQL statements was cut from 800 to 130.

In non-tree cases where objects have sparsely populated collection
fields, the improvement can be even more dramatic.

If concurrency of the collection field is controlled at the owner class
level (the default), I think there is no danger of this flag getting out
of sync with the actual collection content without entering a concurrent
modification state.

I have not had a chance to think through the transaction commit
implications, if any.

There is a very nice facility in ContainerFieldMapping for indicating
null container fields. I wonder why it is so hard-wired to empty/null
and does not allow non-empty/empty/null differentiation and
optimization.
Is there any reason it is so restrictive? Are there any plans to make it
a bit more flexible, or to directly implement the behavior I outlined
above?

I would greatly appreciate it if you could comment on this and maybe
suggest the best approach to implementing it. Or maybe it is already
implemented and I am missing it :-)

Best Regards

Alex Roytman
Peace Technology, Inc



Re: Proposal: Optimizing empty collection fetch. Meta Column in ContainerFieldMapping

Posted by Marc Prud'hommeaux <mp...@apache.org>.
Alex-

That does sound like a good feature to add. Note that I think the  
"null-indicator" attribute is only available for embedded mappings,  
not for container mappings (although I could be wrong about this).

I'd recommend opening a JIRA issue as a reference for the enhancement  
request, and we can build on that.


