Posted to users@jackrabbit.apache.org by Stefan Hagedorn <na...@gmx.de> on 2010/10/07 15:19:30 UTC

Suggestions for node hierarchy?

Hi, 

I am pretty new to Jackrabbit and while thinking about how to organize my content in the repo, I was wondering if the structure/hierarchy of the nodes has a (significant) impact on indexing and searching. 

I was thinking about organizing my content in files and folders, because I read somewhere that a deep hierarchy is better than a flat one, where every content node is a direct child of the root node. 

In a simple benchmark test I wasn't able to measure any difference in execution time for XPath queries between the two variants.

Does anybody have any hints?

Thanks in advance,
Stefan

Re: Suggestions for node hierarchy?

Posted by Stefan Hagedorn <na...@gmx.de>.
Hi Bertrand,

Thank you for your response. It really helped a lot.


Stefan

On 07.10.2010 17:54, Bertrand Delacretaz wrote:
> [...]


Re: Suggestions for node hierarchy?

Posted by Bertrand Delacretaz <bd...@apache.org>.
Hi,

On Thu, Oct 7, 2010 at 3:19 PM, Stefan Hagedorn <na...@gmx.de> wrote:
> ...I am pretty new to Jackrabbit and while thinking about how to organize my content in the repo, I was
> wondering if the structure/hierarchy of the nodes has an (significant) impact on indexing and searching....

I don't have deep knowledge of performance issues, but I don't think so.

The only limitation that I'm aware of is that a node should not have
more than N direct child nodes, N being around 10,000 last time I
checked. Far fewer than that also helps when a human is looking at
your tree ;-)
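
One common way to stay under a per-node child limit like the one mentioned above is to spread nodes over intermediate "bucket" folders derived from a hash of the node name. The following is only a sketch in plain Java, not Jackrabbit API: the class BucketedPaths, the method bucketedPath, and the example path /content/items are made up for illustration, and the name is assumed to already be a valid JCR node name.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper, not part of Jackrabbit or the JCR API. */
public final class BucketedPaths {

    private BucketedPaths() {
    }

    /**
     * Builds parent/xx/yy/.../name, where each xx is one byte of the
     * SHA-1 digest of the name, written as two hex digits.
     * Assumption: name is already a valid JCR node name.
     */
    public static String bucketedPath(String parent, String name, int levels) {
        // A SHA-1 digest has 20 bytes, so at most 20 bucket levels exist.
        if (levels < 0 || levels > 20) {
            throw new IllegalArgumentException("levels must be between 0 and 20");
        }
        // Avoid a double slash when the parent is "/" or ends with "/".
        String base = parent.endsWith("/")
                ? parent.substring(0, parent.length() - 1)
                : parent;
        byte[] digest = sha1(name);
        StringBuilder path = new StringBuilder(base);
        for (int i = 0; i < levels; i++) {
            path.append('/').append(String.format("%02x", digest[i] & 0xff));
        }
        return path.append('/').append(name).toString();
    }

    private static byte[] sha1(String text) {
        try {
            return MessageDigest.getInstance("SHA-1")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

A call such as BucketedPaths.bucketedPath("/content/items", "invoice-12345", 2) returns a path of the form /content/items/xx/yy/invoice-12345. With two levels there are 256 * 256 = 65,536 possible buckets, and each bucket folder has at most 256 sub-buckets. The trade-off is that hash buckets are not self-explaining to a human reader; buckets based on dates or name prefixes may fit better where readability of the tree matters.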

OTOH the node hierarchy has a big impact on how
self-explaining/hackable (in the noble sense) your system is.

http://wiki.apache.org/jackrabbit/DavidsModel has a few pointers; for
the rest, I'd say think of what you would do in a unixish filesystem,
on steroids.

>
> ...I was thinking about organizing my content in files and folders, because I read somewhere that a deep
> hierarchy is better than a flat one, where every content node is a direct child of the root node....

Files and folders can work well, and you might want to start with a few
subdivisions under the root node, again something like a unixish
filesystem: /libs for code, /content, /etc for configs, /tmp, and so on.

My basic benchmark for a node structure is "can someone figure out
what this is by just looking at the tree of nodes and properties" - I
think making this self-explaining and logical is the beauty of JCR.
This includes self-explaining "local micro-trees" under your main
pieces of content: pages, business objects, whatever.

I also (shameless plug) wrote a blog post about this at
http://dev.day.com/content/ddc/blog/2009/04/cq5tags.html, which might
help.

>
> ...In a simple benchmark test I wasn't able to figure out any differences of execution time for xpath queries
> between both variants....

Matches my experience.
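
For anyone wanting to repeat such a comparison, a small timing harness along the following lines may help. This is a rough sketch, not a rigorous benchmark: the class QueryTiming and the method medianNanos are invented for illustration, and the task passed in is a placeholder. Against a real repository the task would wrap a JCR call such as queryManager.createQuery(statement, Query.XPATH).execute(), and it is probably worth iterating over the whole result inside the task so that result fetching is included in the measurement.

```java
import java.util.Arrays;
import java.util.function.Supplier;

/** Hypothetical timing helper, not part of Jackrabbit or the JCR API. */
public final class QueryTiming {

    private QueryTiming() {
    }

    /**
     * Runs the task a few times to warm up, then measures it repeatedly
     * and returns the median duration in nanoseconds.
     */
    public static <T> long medianNanos(Supplier<T> task, int warmups, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        // Warm-up runs give the JIT compiler and any caches a chance to settle.
        for (int i = 0; i < warmups; i++) {
            task.get();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.get();
            samples[i] = System.nanoTime() - start;
        }
        // The median is less sensitive to outliers (GC pauses) than the mean.
        Arrays.sort(samples);
        return samples[runs / 2];
    }
}
```

Measuring the same query once against the flat layout and once against the deep layout, with identical content in both, gives two medians that can be compared directly.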

-Bertrand