Posted to users@jackrabbit.apache.org by David Hausladen <da...@yahoo.com> on 2012/05/07 21:03:47 UTC

BTreeManager - opt in or automatic?

Hi,

I have some questions about BTreeManager and how it is used to solve the problem of large sets of child nodes. We're converting a legacy document store into Jackrabbit, preserving file paths. Unfortunately, this legacy store is very flat in places, and where it is, performance suffers, sometimes dramatically. A few searches turned up BTreeManager, but there is very little documentation on how to use it.

1.) First of all, is it automatically used by Jackrabbit to manage the children of all nodes, or is it an opt-in feature?
2.) If it is an opt-in feature, does one need to always use ItemSequence to access the children of a node?  Based on what I see of the code, it seems necessary since without the mapping performed by ItemSequence, the backing nodes' getNodes methods would return the internal (hierarchical) rather than the flat structure.  

Thanks,
Dave

Re: BTreeManager - opt in or automatic?

Posted by Michael Dürig <md...@apache.org>.

On 15.5.12 18:26, David Hausladen wrote:
> Added https://issues.apache.org/jira/browse/JCR-3311 as suggested.

Thanks, I'll have a look when time permits.
Michael

Re: BTreeManager - opt in or automatic?

Posted by David Hausladen <da...@yahoo.com>.
Added https://issues.apache.org/jira/browse/JCR-3311 as suggested.

--
View this message in context: http://jackrabbit.510166.n4.nabble.com/BTreeManager-opt-in-or-automatic-tp4615575p4634937.html
Sent from the Jackrabbit - Dev mailing list archive at Nabble.com.

Re: BTreeManager - opt in or automatic?

Posted by Michael Dürig <md...@apache.org>.
On 8.5.12 15:15, David Hausladen wrote:
> I've attached the class as it stands (currently untested).  The main struggle is how to handle the NodeIterator's getPosition method (I'm taking the easy way out by throwing an exception). Also worth mentioning: I'm on 2.2.5.

This looks pretty good to me. Regarding the getPosition method, why don't you
just count the number of calls to next() in AdaptingNodeIterator?
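
A minimal sketch of that counting approach follows. It is not the attached class, whose code does not appear in this thread. PositionedIterator is a hypothetical local stand-in for the skip/getSize/getPosition methods that javax.jcr.NodeIterator inherits from RangeIterator, declared here only so the example compiles without the JCR API; CountingIteratorAdapter and the demo class are likewise invented names.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for the range-iterator methods of javax.jcr.NodeIterator.
interface PositionedIterator<T> extends Iterator<T> {
    void skip(long skipNum);   // advance by skipNum elements
    long getSize();            // total number of elements, or -1 if unknown
    long getPosition();        // 0-based index of the element the next call returns
}

// Adapter: wraps a plain Iterator and derives the position by counting next() calls.
final class CountingIteratorAdapter<T> implements PositionedIterator<T> {
    private final Iterator<T> delegate;
    private final long size;
    private long position = 0;

    CountingIteratorAdapter(Iterator<T> delegate, long size) {
        this.delegate = delegate;
        this.size = size;
    }

    @Override public boolean hasNext() { return delegate.hasNext(); }

    @Override public T next() {
        T item = delegate.next();  // throws NoSuchElementException when exhausted
        position++;                // count only calls that returned an element
        return item;
    }

    @Override public void remove() { delegate.remove(); }

    @Override public void skip(long skipNum) {
        if (skipNum < 0) {
            throw new IllegalArgumentException("skipNum must not be negative");
        }
        for (long i = 0; i < skipNum; i++) {
            next();                // reuse next() so the position stays in sync
        }
    }

    @Override public long getSize() { return size; }

    @Override public long getPosition() { return position; }
}

class CountingIteratorAdapterDemo {
    public static void main(String[] args) {
        PositionedIterator<String> it =
            new CountingIteratorAdapter<>(Arrays.asList("a", "b", "c").iterator(), 3);
        it.next();
        it.skip(1);
        System.out.println(it.getPosition() + " of " + it.getSize());
    }
}
```

Routing skip() through next() keeps the counter correct without a second bookkeeping path. If the size of the underlying sequence is not known up front, passing -1 follows the convention JCR uses for "unknown".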

It would be great if you could attach this code as a patch to a JIRA 
issue for jcr-commons. I could then try to incorporate it into trunk.

Michael



Re: BTreeManager - opt in or automatic?

Posted by David Hausladen <da...@yahoo.com>.
I've attached the class as it stands (currently untested).  The main struggle is how to handle the NodeIterator's getPosition method (I'm taking the easy way out by throwing an exception). Also worth mentioning: I'm on 2.2.5.



Re: BTreeManager - opt in or automatic?

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Tue, May 8, 2012 at 3:51 AM, David Hausladen
<da...@yahoo.com> wrote:
> Under the assumption that it's opt in, I started an implementation of the Node interface that
> adapts to the ItemSequence's NodeSequence (to minimize visibility of the
> org.apache.jackrabbit.commons.flat classes).

If you like, I'd love to incorporate something like that in
jackrabbit-jcr-commons.

> The fact that NodeSequence returns Iterator<javax.jcr.Node> instead of
> javax.jcr.NodeIterator makes it impossible.

Use the Adapter pattern.

BR,

Jukka Zitting

Re: BTreeManager - opt in or automatic?

Posted by David Hausladen <da...@yahoo.com>.
Under the assumption that it's opt in, I started an implementation of the Node interface that adapts to the ItemSequence's NodeSequence (to minimize visibility of the org.apache.jackrabbit.commons.flat classes).  The fact that NodeSequence returns Iterator<javax.jcr.Node> instead of javax.jcr.NodeIterator makes it impossible.




Re: BTreeManager - opt in or automatic?

Posted by Michael Dürig <md...@apache.org>.

On 7.5.12 20:03, David Hausladen wrote:
> Hi,
>
> I have some questions about BTreeManager and how it is used to solve the problem of large sets of child nodes. We're converting a legacy document store into Jackrabbit, preserving file paths. Unfortunately, this legacy store is very flat in places, and where it is, performance suffers, sometimes dramatically. A few searches turned up BTreeManager, but there is very little documentation on how to use it.
>
> 1.) First of all, is it automatically used by Jackrabbit to manage the children of all nodes, or is it an opt-in feature?

It is an opt-in feature. Jackrabbit only uses it in the UserManager 
implementation for managing groups with many members.

> 2.) If it is an opt-in feature, does one need to always use ItemSequence to access the children of a node?  Based on what I see of the code, it seems necessary since without the mapping performed by ItemSequence, the backing nodes' getNodes methods would return the internal (hierarchical) rather than the flat structure.

Basically yes. TreeManager instances are responsible for mapping 
sequential structures to tree structures. The BTreeManager uses a BTree 
for that. NodeSequence and PropertySequence provide means for accessing 
nodes in such trees as though they were a sequential structure. If you 
access the underlying tree directly, you will see the internal BTree 
structure.
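
The difference between the two views can be illustrated with a toy model. This is only a sketch under loud assumptions: ToyNode, ToySequence, the "internal" flag and the bucket names are all invented for illustration and are not the Jackrabbit API, and the real layout BTreeManager produces is not described in this thread.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy node with ordered children (hypothetical, not javax.jcr.Node).
final class ToyNode {
    final String name;
    final boolean internal;  // true for bucket nodes that exist only to limit fan-out
    final Map<String, ToyNode> children = new TreeMap<>();

    ToyNode(String name, boolean internal) {
        this.name = name;
        this.internal = internal;
    }

    ToyNode add(ToyNode child) {
        children.put(child.name, child);
        return child;
    }
}

// Sequential view: walks the tree in order and hides the internal bucket nodes.
final class ToySequence implements Iterable<ToyNode> {
    private final ToyNode root;

    ToySequence(ToyNode root) {
        this.root = root;
    }

    @Override public Iterator<ToyNode> iterator() {
        List<ToyNode> items = new ArrayList<>();
        collect(root, items);
        return items.iterator();
    }

    private static void collect(ToyNode node, List<ToyNode> out) {
        for (ToyNode child : node.children.values()) {
            if (child.internal) {
                collect(child, out);  // descend into a bucket
            } else {
                out.add(child);       // a real item of the sequence
            }
        }
    }
}

class ToySequenceDemo {
    public static void main(String[] args) {
        ToyNode root = new ToyNode("docs", false);
        ToyNode low = root.add(new ToyNode("a-m", true));
        ToyNode high = root.add(new ToyNode("n-z", true));
        low.add(new ToyNode("alpha.txt", false));
        low.add(new ToyNode("beta.txt", false));
        high.add(new ToyNode("nu.txt", false));

        // Direct access shows the internal structure (the buckets).
        System.out.println("direct children: " + root.children.keySet());
        // The sequence view shows only the items, as if they were flat.
        for (ToyNode n : new ToySequence(root)) {
            System.out.println("sequence item: " + n.name);
        }
    }
}
```

In this toy, listing the root's children directly yields the buckets, while iterating the sequence yields only the documents, which is the distinction described above between the underlying tree and the sequence view.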

Michael


