Posted to users@jackrabbit.apache.org by Jan Grathwohl <ja...@kontrast.de> on 2008/05/28 12:19:19 UTC
NodeIterator drops nodes when sorting
Hi,
In our application, an XPath query does not return all the results it
should. Debugging through the Jackrabbit internals showed us that the
NodeIterator from the query result received the right UUIDs from the
Lucene index, but then removed some of them from the list because
sorting the results failed.
The situation is that some nodes in the result list have ancestors
that are not accessible to the Session (they are blocked by our
AccessManager). The query returns a DocOrderNodeIteratorImpl that
contains these nodes, and this iterator tries to sort the nodes before
the first method call that accesses them. The comparator then gets an
AccessDeniedException from getAncestor() on one of the nodes and
removes the nodes with inaccessible ancestors from the node list. It
also looks like the comparator removes both compared nodes from the
result list if either of them throws an exception during the comparison.
Is this the intended behaviour, that a query does not return nodes
that cannot be sorted? And is it generally supported in Jackrabbit to
have nodes whose ancestors are not accessible?
We could work around this by turning off the sorting, since we don't
need sorted query results here. Is there a way to do this through the
JCR or Jackrabbit API? We currently do it by accessing the private
org.apache.jackrabbit.core.query.lucene.QueryImpl object from the
query result through Java reflection and then calling
setRespectDocumentOrder(false) on it. But maybe there is a nicer way
(probably almost any way would be nicer) to achieve the same result?
I am attaching the XPath query and the exception stack trace from our
log file.
Best regards and Thanks,
Jan
15:55:53,194 DEBUG [tcmdataaccess] QueryString is: //element(*, tcs:category) [fn:lower-case(@tcs:defaultContentType) = 'information'] /element(*, tcs:categorylocalization) [ @tcs:locview = 'pngo' and @tcs:loclanguage = 'de']
15:56:41,836 ERROR [DocOrderNodeIteratorImpl] Exception while sorting nodes in document order: javax.jcr.AccessDeniedException: cannot read item 827cae10-ad2e-44ad-927f-a65e96e0d4f2
javax.jcr.AccessDeniedException: cannot read item 827cae10-ad2e-44ad-927f-a65e96e0d4f2
    at org.apache.jackrabbit.core.ItemManager.getItem(ItemManager.java:392)
    at org.apache.jackrabbit.core.ItemManager.getNode(ItemManager.java:350)
    at org.apache.jackrabbit.core.ItemImpl.getAncestor(ItemImpl.java:1403)
    at org.apache.jackrabbit.core.query.lucene.DocOrderNodeIteratorImpl$1.compare(DocOrderNodeIteratorImpl.java:220)
    at java.util.Arrays.mergeSort(Arrays.java:1284)
    at java.util.Arrays.mergeSort(Arrays.java:1295)
    at java.util.Arrays.sort(Arrays.java:1223)
    at org.apache.jackrabbit.core.query.lucene.DocOrderNodeIteratorImpl.initOrderedIterator(DocOrderNodeIteratorImpl.java:172)
    at org.apache.jackrabbit.core.query.lucene.DocOrderNodeIteratorImpl.hasNext(DocOrderNodeIteratorImpl.java:131)
    at kontrast.toshiba.datastore.accessimpl.content.search.ContentSearchImpl.getContentList(ContentSearchImpl.java:267)
    at kontrast.toshiba.datastore.accessimpl.content.search.ContentSearchImpl.findContents(ContentSearchImpl.java:187)
    at kontrast.toshiba.datastore.accessimpl.content.search.ContentSearchImpl.performQuery(ContentSearchImpl.java:117)
    at kontrast.toshiba.datastore.accessimpl.content.search.ContentSearchImpl.getResults(ContentSearchImpl.java:89)
RE: NodeIterator drops nodes when sorting
Posted by Ard Schrijvers <a....@onehippo.com>.
Hello,
>
> To clarify the situation: I have a node with the path a/b/c
> in the XPath query result; node c can be accessed, but a and
> b cannot. Node c is then also removed from the result list,
> because the document order cannot be established for a node
> with unauthorized ancestors. So node c is removed from the
> result although it is an authorized and accessible node; only
> its ancestors are not.
This is part of *how* you implemented the AccessManager, I think.
> > Which version of Jackrabbit are you using? First of all, for
> > Jackrabbit > 1.5 the default setting for respectDocumentOrder
> > will be false. For < 1.5 it is true. You can configure it to
> > false/true by adding
> >
> > <param name="respectDocumentOrder" value="false"/>
> >
> > to your <SearchIndex> element in repository.xml
>
> I use Jackrabbit 1.4. I implemented the solution that
> Sébastien suggested, adding an "order by @jcr:score" to the
> query. It works very well for me.
Perfect.
Ard
>
> Regards and Thanks,
>
> Jan
>
>
>
Re: NodeIterator drops nodes when sorting
Posted by Jan Grathwohl <ja...@kontrast.de>.
Hi,
>> The situation is that we have some nodes in the results list
>> where not all of the node's ancestors are accessible for the
>> Session (blocked by our AccessManager). We receive a
>> DocOrderNodeIteratorImpl from the query that contains these
>> nodes, and this iterator tries to sort the nodes before the
>> first method call that accesses them. The comparator then
>> gets an AccessDeniedException from the getAncestor() of one
>> of the nodes, and removes these nodes with inaccessible
>
> This seems correct to me, isn't it?
Is it? I really don't know, but it is at least a behaviour that I was
not aware of.
To clarify the situation: I have a node with the path a/b/c in the
XPath query result; node c can be accessed, but a and b cannot. Node
c is then also removed from the result list, because the document
order cannot be established for a node with unauthorized ancestors.
So node c is removed from the result although it is an authorized and
accessible node; only its ancestors are not.
If I think it through, it is logical in itself: without an "order by"
clause the default is document order -> document order cannot be
established if ancestors are not accessible -> the node is not in the
result. But it is a pitfall if you're not aware of it.
>> ancestors from the node list. It also looks like the
>> Comparator directly removes both compared nodes from the
>> result list if one of them throws an Exception when being compared.
>
> And this is the actual error/problem, isn't it? If so, only the node
> that cannot be accessed should be removed, right?
Yes, I would also consider that wrong; it should not behave like
that.
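A minimal sketch of the difference between the two removal strategies, using only the JDK. Nothing here is Jackrabbit code: FakeNode, CompareFailure and DocOrderSketch are hypothetical names, sorting by path string is a simplification of real document order, and the drop-and-retry loop only approximates the behaviour described in this thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a query result node; not a Jackrabbit class. */
final class FakeNode {
    final String path;
    final boolean ancestorsReadable;

    FakeNode(String path, boolean ancestorsReadable) {
        this.path = path;
        this.ancestorsReadable = ancestorsReadable;
    }
}

/** Signals a failed comparison and names the nodes to drop before retrying. */
final class CompareFailure extends RuntimeException {
    final List<FakeNode> toDrop;

    CompareFailure(List<FakeNode> toDrop) {
        this.toDrop = toDrop;
    }
}

final class DocOrderSketch {
    /**
     * Sorts by path (a simplification of document order). Whenever a
     * comparison fails, nodes are dropped and the sort is retried.
     * dropBoth = true drops both compared nodes; false drops only the
     * node whose ancestors could not be read.
     */
    static List<String> sortedPaths(List<FakeNode> input, boolean dropBoth) {
        Comparator<FakeNode> byPath = (a, b) -> {
            if (a.ancestorsReadable && b.ancestorsReadable) {
                return a.path.compareTo(b.path);
            }
            List<FakeNode> drop = new ArrayList<>();
            if (dropBoth || !a.ancestorsReadable) drop.add(a);
            if (dropBoth || !b.ancestorsReadable) drop.add(b);
            throw new CompareFailure(drop);
        };
        List<FakeNode> remaining = new ArrayList<>(input);
        while (true) {
            // Sort a copy so a failed attempt cannot disturb 'remaining'.
            List<FakeNode> attempt = new ArrayList<>(remaining);
            try {
                attempt.sort(byPath);
            } catch (CompareFailure failure) {
                remaining.removeAll(failure.toDrop);
                continue;
            }
            List<String> paths = new ArrayList<>();
            for (FakeNode node : attempt) paths.add(node.path);
            return paths;
        }
    }
}
```

With three result nodes of which one has unreadable ancestors, sortedPaths(nodes, true) returns a single path, while sortedPaths(nodes, false) returns the two sortable ones. In both cases the node that cannot be sorted is still dropped, which is the pitfall discussed above.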
> Which version of Jackrabbit are you using? First of all, for
> Jackrabbit > 1.5 the default setting for respectDocumentOrder will
> be false. For < 1.5 it is true. You can configure it to false/true
> by adding
>
> <param name="respectDocumentOrder" value="false"/>
>
> to your <SearchIndex> element in repository.xml
I use Jackrabbit 1.4. I implemented the solution that Sébastien
suggested, adding an "order by @jcr:score" to the query. It works very
well for me.
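A minimal sketch of this workaround, using only string handling from the JDK. The class and method names (XPathOrdering, withScoreOrder) are hypothetical, and the check for an existing "order by" clause is deliberately naive: it would be fooled by a string literal containing those words. The resulting statement would then be passed to QueryManager#createQuery(statement, Query.XPATH) as usual.

```java
import java.util.Locale;

/** Hypothetical helper; not part of the JCR or Jackrabbit API. */
final class XPathOrdering {
    /** Ordering clause suggested in this thread to bypass document-order sorting. */
    static final String ORDER_BY_SCORE = " order by @jcr:score";

    /** Appends the clause unless the statement already seems to have an ordering. */
    static String withScoreOrder(String xpath) {
        String trimmed = xpath.trim();
        // Naive check (assumption): any " order by " in the text counts as an ordering.
        if (trimmed.toLowerCase(Locale.ROOT).contains(" order by ")) {
            return trimmed;
        }
        return trimmed + ORDER_BY_SCORE;
    }
}
```

The helper leaves a statement alone if it already has an ordering, so it can be applied to every query before it is created.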
Regards and Thanks,
Jan
RE: NodeIterator drops nodes when sorting
Posted by Ard Schrijvers <a....@onehippo.com>.
Hello,
> The situation is that we have some nodes in the results list
> where not all of the node's ancestors are accessible for the
> Session (blocked by our AccessManager). We receive a
> DocOrderNodeIteratorImpl from the query that contains these
> nodes, and this iterator tries to sort the nodes before the
> first method call that accesses them. The comparator then
> gets an AccessDeniedException from the getAncestor() of one
> of the nodes, and removes these nodes with inaccessible
This seems correct to me, isn't it?
> ancestors from the node list. It also looks like the
> Comparator directly removes both compared nodes from the
> result list if one of them throws an Exception when being compared.
And this is the actual error/problem, isn't it? If so, only the node
that cannot be accessed should be removed, right?
>
> Is this the intended behaviour, that nodes won't be returned by a
> query when they cannot be sorted? And is it generally
> supported in Jackrabbit to have nodes whose ancestors are not
> accessible?
>
> We could work around that by turning off the sorting of the
> nodes, we don't need sorted query results here. Is there a
> way to achieve this through JCR or Jackrabbit API? We are
> currently doing this by accessing the private
> org.apache.jackrabbit.core.query.lucene.QueryImpl object from
> the query result through Java reflection, and then calling a
> setRespectDocumentOrder(false) on it. But maybe there is a
> nicer way (as probably almost any way would be nicer) to
> achieve the same result?
Which version of Jackrabbit are you using? First of all, for
Jackrabbit > 1.5 the default setting for respectDocumentOrder will be
false. For < 1.5 it is true. You can configure it to false/true by adding
<param name="respectDocumentOrder" value="false"/>
to your <SearchIndex> element in repository.xml
Regards Ard
>
> I will attach the XPath query and Exception stack trace from
> our log file.
>
> Best regards and Thanks,
>
> Jan
>
>
Re: NodeIterator drops nodes when sorting
Posted by Jan Grathwohl <ja...@kontrast.de>.
Hi Sébastien,
"order by @jcr:score" is exactly what I was looking for.
Thank you very much.
Jan
On 28.05.2008 at 20:12, Sébastien Launay wrote:
> Hi Jan,
>
> By setting the JavaBean property respectDocumentOrder to false
> on SearchIndex you can disable document order sorting of query
> results :
> <SearchIndex
> class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
> ...
> <param name="respectDocumentOrder" value="false"/>
> ...
> </SearchIndex>
>
> Indeed, QueryImpl instances will be injected with this property.
> This can also be forced using "order by @jcr:score" in the query.
>
> But this does not change the fact that unauthorized nodes will
> not be returned by the query and will certainly cause warnings to
> be logged (from what I have seen, AccessDeniedException is not
> treated differently from a RepositoryException).
> Another important behavior is that the NodeIterator#getSize()
> implementation may (and in your case will) decrement while
> iterating over the nodes.
>
> I think this is needed to get the best performance.
Re: NodeIterator drops nodes when sorting
Posted by Sébastien Launay <se...@anyware-tech.com>.
Hi Jan,
By setting the JavaBean property respectDocumentOrder to false
on SearchIndex, you can disable document order sorting of query results:
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
...
<param name="respectDocumentOrder" value="false"/>
...
</SearchIndex>
Indeed, QueryImpl instances will be injected with this property.
This can also be forced using "order by @jcr:score" in the query.
But this does not change the fact that unauthorized nodes will
not be returned by the query and will certainly cause warnings to be
logged (from what I have seen, AccessDeniedException is not treated
differently from a RepositoryException).
Another important behavior is that the NodeIterator#getSize()
implementation may (and in your case will) decrement while iterating
over the nodes.
I think this is needed to get the best performance.
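A minimal sketch of what a shrinking size means for calling code, using only the JDK. LazySizeIterator is a hypothetical stand-in, not Jackrabbit's NodeIterator: null entries represent nodes that turn out to be unreadable only when the iterator reaches them. The point is to collect results while iterating instead of sizing an array from an early getSize() call.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical iterator that skips unreadable entries (null) lazily. */
final class LazySizeIterator implements Iterator<String> {
    private final List<String> entries;
    private int pos = 0;
    private long size;
    private String next;

    LazySizeIterator(List<String> entries) {
        this.entries = entries;
        this.size = entries.size(); // optimistic: assumes every entry is readable
    }

    /** Current size estimate; decrements as unreadable entries are found. */
    long getSize() {
        return size;
    }

    @Override
    public boolean hasNext() {
        while (next == null && pos < entries.size()) {
            next = entries.get(pos++);
            if (next == null) {
                size--; // unreadable entry discovered during iteration
            }
        }
        return next != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String result = next;
        next = null;
        return result;
    }
}

final class SizeSafeCollector {
    /** Collects into a growable list instead of trusting an early getSize(). */
    static List<String> collect(LazySizeIterator it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

Code that needs an exact count should read it from the collected list after iteration rather than from getSize() beforehand.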