Posted to oak-dev@jackrabbit.apache.org by Amit Jain <am...@apache.org> on 2014/12/10 12:16:19 UTC

[Discuss] Indexing best practices/recommendations.

Hi,

With the recent addition of LucenePropertyIndex we can create a Lucene
index on a specific node type with multiple properties indexed. Full-text
queries like the one below are served well by this index, and restricting
it to one type helps limit the index size:
* /jcr:root/a/b//element(*, asset)[(jcr:contains(., 'foo'))] order by
@jcr:lastModified

But for generic queries on nt:base like /jcr:root/a/b//*[jcr:contains(.,
'foo')] order by @jcr:lastModified another index needs to be created. What
should be the correct option for creating such an index?
Some options that come to mind are:
* Create a lucene-property index under the /a/b
* Augment the lucene index to also index jcr:lastModified

Option 1 looks best to me, but we might ultimately end up in a situation
where we have many indexes defined to cater for specific use cases.
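
To make the two options concrete, here is a rough sketch that models index
definitions as nested dicts (node name -> properties and child nodes). It is
only an illustration, not an Oak API: the index names (lastModifiedLucene,
lucene) and the helper functions are hypothetical, and the property names
(indexRules, propertyIndex, ordered, async) are assumptions about the Lucene
index definition format rather than something stated in this thread. The
full-text part of the configuration is left out.

```python
import copy


def ordered_property(prop_name):
    """Describe one property the index should keep sortable (assumed keys)."""
    return {"name": prop_name, "propertyIndex": True, "ordered": True}


def lucene_index(node_type, prop_names):
    """Build a nested-dict model of a Lucene index definition for one node type."""
    # Child node names drop the namespace prefix: jcr:lastModified -> lastModified
    props = {name.split(":")[-1]: ordered_property(name) for name in prop_names}
    return {
        "jcr:primaryType": "oak:QueryIndexDefinition",
        "type": "lucene",
        "async": "async",
        "indexRules": {node_type: {"properties": props}},
    }


def augment(definition, node_type, prop_name):
    """Return a copy of an existing definition that also indexes prop_name."""
    updated = copy.deepcopy(definition)
    rule = updated["indexRules"].setdefault(node_type, {"properties": {}})
    rule["properties"][prop_name.split(":")[-1]] = ordered_property(prop_name)
    return updated


# Option 1: a dedicated index stored under /a/b (hypothetical index name)
option1 = {
    "/a/b/oak:index/lastModifiedLucene": lucene_index("nt:base", ["jcr:lastModified"])
}

# Option 2: extend a (hypothetical) existing index defined at the root
existing = lucene_index("nt:base", [])
option2 = {"/oak:index/lucene": augment(existing, "nt:base", "jcr:lastModified")}
```

Option 1 adds a second definition next to the content it serves, while
Option 2 leaves a single definition that grows with every new requirement.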


Thanks
Amit

Re: [Discuss] Indexing best practices/recommendations.

Posted by Davide Giannella <da...@apache.org>.
On 10/12/2014 11:16, Amit Jain wrote:
> Hi,
>
> With the recent addition of LucenePropertyIndex we can create a lucene
> index on a specific type with multiple properties indexed. The queries with
> full-text like below are served well by this index and help limit the index
> size
> * /jcr:root/a/b//element(*, asset)[(jcr:contains(., 'foo'))] order by
> @jcr:lastModified
>
> But for generic queries on nt:base like /jcr:root/a/b//*[jcr:contains(.,
> 'foo')] order by @jcr:lastModified another index needs to be created. What
> should be the correct option for creating such an index?
> Some options that come to mind are:
> * Create a lucene-property index under the /a/b
> * Augment the lucene index to also index jcr:lastModified
>
> Option 1 looks best to me, but we might ultimately end up in a situation
> where we have many indexes defined to cater for specific use cases.
>
It depends on the overhead of creating a new index compared with
expanding the existing one with a new property. I didn't really
understand what was indexed and what was not in your examples, though.

Adding more indexes should not be an issue. As with any DB, you add an
index when you need it, and Oak can be considered a database :)

My suggestion is that if you have only a query like

/jcr:root/a/b//*[jcr:contains(., 'foo')] order by @jcr:lastModified

you can create an index under /a/b, but if you start seeing more and more
queries on different paths, there will be a point where it is better to
have a generic index under root. Or, if you prefer query performance over
index size, I would go for multiple indexes.
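
As a small illustration of the scoping trade-off, the sketch below lists
which index locations cover a given query path, assuming an index defined
under some path only covers content below that path. The function names and
the example paths other than /a/b are hypothetical, and this is not how Oak
chooses an index (that is cost based); it only shows why path-specific
indexes multiply as queries spread across the tree.

```python
def covers(index_root, query_path):
    """True if content at query_path lies at or below index_root."""
    if index_root == "/":
        return True
    root = index_root.rstrip("/")
    return query_path == root or query_path.startswith(root + "/")


def covering_index_roots(index_roots, query_path):
    """Return the index locations whose subtree contains query_path."""
    return [root for root in index_roots if covers(root, query_path)]


# One generic index at the root plus two path-specific ones
index_roots = ["/", "/a/b", "/x/y"]

under_ab = covering_index_roots(index_roots, "/a/b/c")
elsewhere = covering_index_roots(index_roots, "/a/c")
```

Here a query below /a/b can be served by either the root index or the one
under /a/b, while a query on /a/c only has the root index available.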

Anyhow, we didn't really test the use case of a very large number of
indexes and how it could affect the queries.

Cheers
Davide