You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@jackrabbit.apache.org by Øyvind Stegard <oy...@usit.uio.no> on 2008/03/12 20:57:50 UTC

JCR 1.0 SQL queries with JackRabbit

Hello,

We are currently in the process of evaluating JackRabbit as a backend
content repository for our CMS, using the JCR 1.0 API (JackRabbit is
version 1.4.1).

I have some questions regarding repository queries. I am using the SQL
syntax.


1) 
I am having difficulties with some queries which are based on the
jcr:path pseudo property:

Descendant or self (similar example found on page 305 of the JCR 1.0
spec):
SELECT * FROM nt:base WHERE jcr:path = '/foo' OR jcr:path LIKE '/foo/%'

Here I get the following error: Invalid combination of jcr:path clauses

The problem seems to be general, in that I am unable to combine multiple
jcr:path clauses in a single query (with a few exceptions, see question
2). Even a simple expression like "jcr:path = '/foo' OR jcr:path =
'/bar'" fails.

If this is a known limitation in JackRabbit 1.4, will such queries be
possible in JCR 2.0, using the SQL2 syntax or QOM ? I would certainly
prefer using a qom directly.


2)
Is there any feasible way of querying for path depth using the jcr:path
pseudo property ? When talking about "path depth", I am only concerned
about absolute node paths, where "/" has depth 0, "/foo" has depth 1 and
so on. It is the same as the depth concept in
org.apache.jackrabbit.spi.Path#getDepth(), but only for absolute paths. 

For instance, for a query that lists file/folder nodes at exactly depth
1, i.e. the children of the root node, one would think something like
this works**:

SELECT * FROM nt:base WHERE (jcr:primaryType = 'nt:file' OR
jcr:primaryType = 'nt:folder') AND jcr:path LIKE '/%' AND NOT jcr:path
LIKE '/%/%'

**nt:file and nt:folder only used as example, in reality we have our own
custom types representing the file/folder concept.

And in fact, this particular query works as expected (even though it
contains multiple jcr:path clauses). It is just like the "Child" SQL
query example on page 305 of the JCR 1.0 spec.

Generalising this approach for depth-N, however, seems to not work, for
instance for depth 2:

SELECT * FROM nt:base WHERE (jcr:primaryType = 'nt:file' OR
jcr:primaryType = 'nt:folder') AND jcr:path LIKE '/%/%' AND NOT jcr:path
LIKE '/%/%/%'

This one returns a result set containing every nt:file/nt:folder-node in
the repository with depth greater than 1 (we only want exactly 2). Is
this kind of query possible at all when based on jcr:path ?

I could add an integer property containing the path depth for every node
which represents a resource in our CMS, and use that property in
queries. But that property value would be a direct function of jcr:path,
and would need updating whenever jcr:path changes, that's something I do
*not* want to do.

Supporting path depth clauses in queries might be considered somewhat
eccentric, but it is something we currently support in our CMS with our
own backend query implementation. Some general background information
about our CMS is provided at the end of this email.


3)
It seems that logical inversion of LIKE queries on jcr:path is ignored
by JackRabbit.

Two examples:

SELECT * FROM nt:base WHERE jcr:path NOT LIKE '/foo/%'
and
SELECT * FROM nt:base WHERE NOT (jcr:path LIKE '/foo/%')

returns exactly the same result set as:

SELECT * FROM nt:base WHERE jcr:path LIKE '/foo/%'

So the "NOT" part seems to be completely ignored. I suspect this
limitation is because of performance-concerns and the Lucene-based query
backend ?


4)
It seems that "NOT LIKE .." generally does not invert matching for any
kind of JCR property ? However, I could work around that by wrapping a
regular "LIKE" inside a NOT(...) expression, i.e. "NOT (foo:bar LIKE
'baz%')". Perhaps that's not very wise, performance-wise (pun
intended) ?


Am I right in assuming that many of these limitations are present
because of the relation to XPath, and mapping of SQL queries to an
XPath-based query model internally in JackRabbit 1.4 ?



Some background information
Our CMS uses a strictly hierarchical resource model, with
file-resources, collection-resources and properties. It is implemented
using the Spring framework and provides normal web access, in addition
to access through the WebDAV protocol. At the bottom, we have our own
repository/backend implementation which handles persistence of all
resources and a custom Lucene-based search/query implementation. This is
where JCR comes into the picture, we are experimenting with using JCR
for both of these tasks (persistence and query). 

We have our own query language, parser and query object model. I am
currently working on mapping this to equivalent JCR 1.0 SQL syntax for
our experimental JCR-based backend. In our language, we have support for
a few constructs which are based around URIs (which is the hierarchical
identifier we use for all resources in our CMS). These URIs are directly
mapped to corresponding JCR node paths, hence all the questions about
queries on jcr:path :). Our own Lucene-based query implementation has
special fields for optimizing the performance of queries which are based
on the hierarchy of resources. 

What should we expect of support for these types of queries in future
versions of JackRabbit, JCR 2.0, etc ?

Thanks for any answers in advance :)

Regards,
Øyvind Stegard
IT-department, University of Oslo, Norway
-- 
< Øyvind Stegard ~ oyvind stegard at usit uio no
 < USIT, UiO


Re: JCR 1.0 SQL queries with JackRabbit

Posted by Marcel Reutegger <ma...@gmx.net>.
Øyvind Stegard wrote:
> Yes, I've been causally browsing this document for a while. Looking at
> the roadmap for the JackRabbit project, I see that JCR2.0-support is in
> the medium term time category, any rough ideas of how long that actually
> is :) ? I can't seem to find any indications elsewhere.

we are currently working on the new features defined in JSR 283, though most of 
them are currently hidden because the new methods are not yet available in the 
JCR interfaces.

I'd say it is going to be Q2 or Q3 this year. We are pretty much tied to the JSR 
283 schedule, since Jackrabbit will be the basis for the reference 
implementation. looking at http://jcp.org/en/jsr/detail?id=283 it says:

2007.12.11: Updated Schedule:
[...]
Mar-2008 Proposed Final draft submitted
May-2008 Final release

the dates already slipped a couple of times and that may well happen one more time.

regards
  marcel

Re: JCR 1.0 SQL queries with JackRabbit

Posted by Øyvind Stegard <oy...@usit.uio.no>.
man, 17.03.2008 kl. 15.36 +0100, Marcel Reutegger:
> Hi Øyvind,
> 
Hi !

<snip>
> In JCR 1.0 only one basic path constraint per query is possible. It boils down to:
> 1) a node with a given path
> 2) nodes, which are children of a given path
> 3) nodes, which are descendants of a given path
> 
Ok. What confuses me somewhat by this is that the spec says that
descendant or self must be supported (section 6.6.3.4 "Path Constraint",
page 108). Maybe there's something I've misunderstood, it's not really a
big deal.

<snip>
> > 2)
> > Is there any feasible way of querying for path depth using the jcr:path
> > pseudo property ? When talking about "path depth", I am only concerned
> > about absolute node paths, where "/" has depth 0, "/foo" has depth 1 and
> > so on. It is the same as the depth concept in
> > org.apache.jackrabbit.spi.Path#getDepth(), but only for absolute paths. 
<snip>
> no, this is not possible with SQL but you can use XPath:
> 
> /jcr:root/*/*[@jcr:primaryType = 'nt:file' or @jcr:primaryType = 'nt:folder']
> 
> or you can use the common base type:
> 
> /jcr:root/*/element(*, nt:hierarchyNode)
> 
I see, that works well :)


<snip>
> > It seems that logical inversion of LIKE queries on jcr:path is ignored
> > by JackRabbit.
> > 
> > Two examples:
> > 
> > SELECT * FROM nt:base WHERE jcr:path NOT LIKE '/foo/%'
> > and
> > SELECT * FROM nt:base WHERE NOT (jcr:path LIKE '/foo/%')
<snip>
> This is a bug in Jackrabbit and should actually throw an InvalidQueryException.
> 
Ok.


> > 4)
> > It seems that "NOT LIKE .." generally does not invert matching for any
> > kind of JCR property ?
> 
> JCR 1.0 does not specify a <property-name> NOT LIKE <string-literal>, which 
> means jackrabbit should probably also throw an InvalidQueryException in this case.
> 
> > However, I could work around that by wrapping a
> > regular "LIKE" inside a NOT(...) expression, i.e. "NOT (foo:bar LIKE
> > 'baz%')". Perhaps that's not very wise, performance-wise (pun
> > intended) ?
> 
> yes, and it's also not exactly the same. While foo:bar not like 'baz%' returns 
> all nodes with a foo:bar not starting with 'baz', not(foo:bar like 'baz%') will 
> also return nodes that do not even have the property.
Ah, I realize this now. Perhaps I can add a
".. AND foo:bar IS NOT NULL"-clause which should fix that logical flaw.


<snip>
> > What should we expect of support for these types of queries in future
> > versions of JackRabbit, JCR 2.0, etc ?
> 
> The features that Jackrabbit will support in the future will be those defined in 
> JSR 283. You can download the public preview here: 
> http://jcp.org/aboutJava/communityprocess/pr/jsr283/
> 
Yes, I've been causally browsing this document for a while. Looking at
the roadmap for the JackRabbit project, I see that JCR2.0-support is in
the medium term time category, any rough ideas of how long that actually
is :) ? I can't seem to find any indications elsewhere.


> In addition there will be Jackrabbit proprietary extensions driven by the 
> user/developer community. We did this already in the past when we saw a need for 
> certain features that were not specified in JSR 170.
> 


Thanks for the reply ! 

Regards,
Øyvind S.
-- 
< Øyvind Stegard ~ oyvind stegard at usit uio no
 < USIT, UiO


Re: JCR 1.0 SQL queries with JackRabbit

Posted by Marcel Reutegger <ma...@gmx.net>.
Hi Øyvind,

Øyvind Stegard wrote:
> 1) 
> I am having difficulties with some queries which are based on the
> jcr:path pseudo property:
> 
> Descendant or self (similar example found on page 305 of the JCR 1.0
> spec):
> SELECT * FROM nt:base WHERE jcr:path = '/foo' OR jcr:path LIKE '/foo/%'
> 
> Here I get the following error: Invalid combination of jcr:path clauses
> 
> The problem seems to be general, in that I am unable to combine multiple
> jcr:path clauses in a single query (with a few exceptions, see question
> 2). Even a simple expression like "jcr:path = '/foo' OR jcr:path =
> '/bar'" fails.

In JCR 1.0 only one basic path constraint per query is possible. It boils down to:
1) a node with a given path
2) nodes, which are children of a given path
3) nodes, which are descendants of a given path

> If this is a known limitation in JackRabbit 1.4, will such queries be
> possible in JCR 2.0, using the SQL2 syntax or QOM ? I would certainly
> prefer using a qom directly.

yes, this will be possible. In JCR 2.0 you can combine multiple path constraints.

> 2)
> Is there any feasible way of querying for path depth using the jcr:path
> pseudo property ? When talking about "path depth", I am only concerned
> about absolute node paths, where "/" has depth 0, "/foo" has depth 1 and
> so on. It is the same as the depth concept in
> org.apache.jackrabbit.spi.Path#getDepth(), but only for absolute paths. 
> 
> For instance, for a query that lists file/folder nodes at exactly depth
> 1, i.e. the children of the root node, one would think something like
> this works**:
> 
> SELECT * FROM nt:base WHERE (jcr:primaryType = 'nt:file' OR
> jcr:primaryType = 'nt:folder') AND jcr:path LIKE '/%' AND NOT jcr:path
> LIKE '/%/%'
> 
> **nt:file and nt:folder only used as example, in reality we have our own
> custom types representing the file/folder concept.
> 
> And in fact, this particular query works as expected (even though it
> contains multiple jcr:path clauses). It is just like the "Child" SQL
> query example on page 305 of the JCR 1.0 spec.
> 
> Generalising this approach for depth-N, however, seems to not work, for
> instance for depth 2:
> 
> SELECT * FROM nt:base WHERE (jcr:primaryType = 'nt:file' OR
> jcr:primaryType = 'nt:folder') AND jcr:path LIKE '/%/%' AND NOT jcr:path
> LIKE '/%/%/%'
> 
> This one returns a result set containing every nt:file/nt:folder-node in
> the repository with depth greater than 1 (we only want exactly 2). Is
> this kind of query possible at all when based on jcr:path ?

no, this is not possible with SQL but you can use XPath:

/jcr:root/*/*[@jcr:primaryType = 'nt:file' or @jcr:primaryType = 'nt:folder']

or you can use the common base type:

/jcr:root/*/element(*, nt:hierarchyNode)

> I could add an integer property containing the path depth for every node
> which represents a resource in our CMS, and use that property in
> queries. But that property value would be a direct function of jcr:path,
> and would need updating whenever jcr:path changes, that's something I do
> *not* want to do.
> 
> Supporting path depth clauses in queries might be considered somewhat
> eccentric, but it is something we currently support in our CMS with our
> own backend query implementation. Some general background information
> about our CMS is provided at the end of this email.
> 
> 
> 3)
> It seems that logical inversion of LIKE queries on jcr:path is ignored
> by JackRabbit.
> 
> Two examples:
> 
> SELECT * FROM nt:base WHERE jcr:path NOT LIKE '/foo/%'
> and
> SELECT * FROM nt:base WHERE NOT (jcr:path LIKE '/foo/%')
> 
> returns exactly the same result set as:
> 
> SELECT * FROM nt:base WHERE jcr:path LIKE '/foo/%'
> 
> So the "NOT" part seems to be completely ignored. I suspect this
> limitation is because of performance-concerns and the Lucene-based query
> backend ?

This is a bug in Jackrabbit and should actually throw an InvalidQueryException.

> 4)
> It seems that "NOT LIKE .." generally does not invert matching for any
> kind of JCR property ?

JCR 1.0 does not specify a <property-name> NOT LIKE <string-literal>, which 
means jackrabbit should probably also throw an InvalidQueryException in this case.

> However, I could work around that by wrapping a
> regular "LIKE" inside a NOT(...) expression, i.e. "NOT (foo:bar LIKE
> 'baz%')". Perhaps that's not very wise, performance-wise (pun
> intended) ?

yes, and it's also not exactly the same. While foo:bar not like 'baz%' returns 
all nodes with a foo:bar not starting with 'baz', not(foo:bar like 'baz%') will 
also return nodes that do not even have the property.

> Am I right in assuming that many of these limitations are present
> because of the relation to XPath, and mapping of SQL queries to an
> XPath-based query model internally in JackRabbit 1.4 ?

yes, kind of, though the actual limitations are specified in JSR 170.

> Some background information
> Our CMS uses a strictly hierarchical resource model, with
> file-resources, collection-resources and properties. It is implemented
> using the Spring framework and provides normal web access, in addition
> to access through the WebDAV protocol. At the bottom, we have our own
> repository/backend implementation which handles persistence of all
> resources and a custom Lucene-based search/query implementation. This is
> where JCR comes into the picture, we are experimenting with using JCR
> for both of these tasks (persistence and query). 
> 
> We have our own query language, parser and query object model. I am
> currently working on mapping this to equivalent JCR 1.0 SQL syntax for
> our experimental JCR-based backend. In our language, we have support for
> a few constructs which are based around URIs (which is the hierarchical
> identifier we use for all resources in our CMS). These URIs are directly
> mapped to corresponding JCR node paths, hence all the questions about
> queries on jcr:path :). Our own Lucene-based query implementation has
> special fields for optimizing the performance of queries which are based
> on the hierarchy of resources. 
> 
> What should we expect of support for these types of queries in future
> versions of JackRabbit, JCR 2.0, etc ?

The features that Jackrabbit will support in the future will be those defined in 
JSR 283. You can download the public preview here: 
http://jcp.org/aboutJava/communityprocess/pr/jsr283/

In addition there will be Jackrabbit proprietary extensions driven by the 
user/developer community. We did this already in the past when we saw a need for 
certain features that were not specified in JSR 170.

regards
  marcel