Posted to xindice-users@xml.apache.org by Pe...@tietoenator.com on 2002/05/02 10:23:08 UTC

XPath - Limiting the number of results returned

Hi,

I'm currently using Xindice to store metadata for an ebXML registry
implementation, and have come across an issue when using XPath to query the
contents of a collection. In some cases (wildcard searches or common keys),
I can get back very large sets of matching results (say, 1000+ fragments).
For performance reasons, I would like to limit the number of XML fragments
returned to, say, no more than 50. Is there any way this support can be
added to Xindice? (I could make the code change myself if needed.) What
I'm looking for is a way to interrupt an XPath query operation once the
specified maximum number of matches (e.g. 50) has been reached, and then
return that set of matched fragments.

Also, is it possible to return 50 results and then cache all other matches
in the background, so that the user can quickly access further results if
needed once they've reviewed the first 50? I'm looking for the kind of
functionality provided by most search engines, where the user can specify
how many results they want to see and can move through the result set page
by page, without the overhead of waiting for all 1000+ results to be found
(which can take ages with large documents).

Any help would be greatly appreciated.

-- Peter

Re: XPath - Limiting the number of results returned

Posted by Jeff Greif <jg...@alumni.princeton.edu>.
I'm not an Xindice developer, so I am more or less guessing about this.

Your first request is probably hard to satisfy without modifying the query
engine (and breaking the standardized XML:DB API in the process).  Most of
the mods would be simple: passing a result-cardinality-bound parameter
through.  The substantial mods would be in the XPathQueryResolver code, whose
execute method would take the upper bound.  The case where there are no
indexes is easy, since you could just use the upper bound as a loop test in
the collection scan (if you were willing to get only the first 50 records
found, not the first 50 in key order).  The harder case is with indexes.
Note that I'm not suggesting that you hack the XPathQueryResolver and all the
code that invokes it -- some kind of design of a parallel, more specialized
set of classes might be better.
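
The no-index case might look roughly like the sketch below.  It rests on
assumptions: BoundedScan and firstMatches are hypothetical names, not Xindice
or XML:DB classes, and a plain Iterator and Predicate stand in for the
collection scan and the XPath match test.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper for illustration; not part of Xindice or the XML:DB API.
final class BoundedScan {
    private BoundedScan() {}

    /**
     * Scans records in iteration order and returns at most maxResults of
     * those accepted by matches. The scan stops as soon as the bound is hit,
     * so records after the last needed match are never examined.
     */
    static <T> List<T> firstMatches(Iterator<T> records,
                                    Predicate<? super T> matches,
                                    int maxResults) {
        if (maxResults < 0) {
            throw new IllegalArgumentException("maxResults must not be negative");
        }
        List<T> found = new ArrayList<>();
        // The upper bound is the loop test, as described above.
        while (found.size() < maxResults && records.hasNext()) {
            T record = records.next();
            if (matches.test(record)) {
                found.add(record);
            }
        }
        return found;
    }
}
```

The results come back in scan order, which is the "first 50 found, not first
50 in key order" caveat; nothing here addresses the indexed case.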

Your second request is for a browsing capability.  RDBMSs implement browsing
with a cursor, which is essentially a hook into the results of a query.  The
query must be fully executed to guarantee result consistency.  Typically the
results are stored in some temporary location in the DB (possibly even in
memory) and the cursor acts like a bookmark on the list.  For different
purposes, there are cursors which live client side and others that live
server side (for instance, a server-side cursor may be used internally to
carry out a complex join in an RDBMS, or inside a stored procedure where
there is server-side code that iterates through a result set).  There are
cursors of varying complexity (which can be optimized to a greater or lesser
degree).  A forward-only cursor can be optimized by discarding results
already seen.  A browsing cursor might allow you to skip over pages in the
results.  The JDBC documentation covers some of this.  Probably cursors could
be implemented as XMLObjects.
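
A browsing cursor of this kind could be sketched as follows.  Everything in
it is an assumption made for illustration: PagedCursor is a hypothetical
class, not an Xindice, XML:DB, or JDBC type, and an Iterator stands in for
the query results.  It pulls results only as pages are requested and keeps
what it has seen, so it does not give the consistency guarantee that a fully
executed query would.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical client-side cursor for illustration only.
final class PagedCursor<T> {
    private final Iterator<T> source;               // stand-in for query results
    private final List<T> cache = new ArrayList<>(); // results fetched so far
    private final int pageSize;

    PagedCursor(Iterator<T> source, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.source = source;
        this.pageSize = pageSize;
    }

    /** Returns the 0-based page pageIndex, or an empty list past the end. */
    List<T> page(int pageIndex) {
        if (pageIndex < 0) {
            throw new IllegalArgumentException("pageIndex must not be negative");
        }
        int from = Math.multiplyExact(pageIndex, pageSize);
        int to = Math.addExact(from, pageSize);
        // Fetch only as far as the end of the requested page.
        while (cache.size() < to && source.hasNext()) {
            cache.add(source.next());
        }
        if (from >= cache.size()) {
            return Collections.emptyList();
        }
        // Copy so callers cannot modify the cache through the returned list.
        return new ArrayList<>(cache.subList(from, Math.min(to, cache.size())));
    }

    /** Number of results pulled from the source so far. */
    int fetchedSoFar() {
        return cache.size();
    }
}
```

Because earlier results stay in the cache, pages can be revisited or skipped
over.  A forward-only variant could discard pages already served instead.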

Jeff