Posted to xindice-dev@xml.apache.org by Koen Gogne <ko...@nss.be> on 2002/11/08 11:49:08 UTC

speeding up XPATH

Dear all,

I have set up a collection with lots of (small) xml-files.
Querying these files takes very (unacceptably) long. 

Is there any way to speed this up ?

I have tried using xindiceadmin 'add_indexer', but that doesn't seem to
have any effect.
I've also tried to bypass the xindice/xindiceadmin tools by running the
query from within a Java program, hoping it would speed up the process.
No luck, however.

Can anyone tell me a solution ?
Can anyone provide me with some sample java-code that does the trick ?

greetz,
Koen Gogne
---------------------------
Koen Gogne
NSS nv
Tieltstraat 167
8740 Pittem
Tel. ++32 (0)51 42 40 15
Fax. ++32(0)51 40 29 22
---------------------------



RE: speeding up XPATH

Posted by Jim Wissner <ji...@jbrix.org>.
I have not done any benchmarking either, but I can add at least anecdotal 
evidence, having worked on several projects using different XML parsers, 
that Xalan's XPathAPI can perform very poorly even with simple queries.  In 
the past I've gone so far as to add simple heuristics to evaluate simple 
xpaths manually and bypass XPathAPI altogether.

As a side note, Jaxen (dom4j's xpath engine) does seem quite a bit faster, 
and iirc, it is modular and will work with DOM just fine - you needn't 
switch your code base to dom4j in order to use it.

Jim



--
jim@jbrix.org

Visit www.jbrix.org for:
   + SpeedJAVA jEdit Code Completion Plugin
   + Xybrix XML Application Framework
   + other great Open Source Software


RE: speeding up XPATH

Posted by John Mc Quillan <jo...@openjawtech.com>.
I understand the difference between // and /*/* and indeed the tests all
seem to use /*/* 

I guess I went on to read the conclusions at the end of the tests which
state

------------------------------------------------------------
Conclusion
These numbers suggest one should use the XPathAPI class of Xalan with
great caution, if at all

The syntax of Xpath statements must be chosen carefully. Contrary to
some belief, and of the topology of our XML format, using /*/* or // was
most efficient compared to the absolute path /ItemResultSet/Item 

It appears more efficient to use selectNodes with Dom4j even if one
needs a single node.

With DOM4J, it is about twice as fast when running XPath against a
document which contains elements vs attributes. 

In our case, we found that Dom4j is faster than Xalan for XSLT
transformations. We do not claim this is a general result, but rather a
datapoint 
-------------------------------------------------------------

Perhaps the author has drawn incorrect conclusions from the tests, but
paragraph 2 refers to // as well as /*/*

As I said, I've not personally done any benchmarking and so I don't
know.

John











RE: speeding up XPATH

Posted by Dave Viner <dv...@yahoo-inc.com>.
i don't think it does... or perhaps i'm misreading it.  the xpath expressions
evaluated in the test are:

/*/*/Attr1x1
/*/*/Attr1x500
/*/*/Attr1x999
/*/*/Item
/*/*[@id="1"]
/*/*[@id="500"]
/*/*[@id="999"]

These are not the '//' style expressions.  An example of the '//' style
expression might be:

//title

where the xml looked like:

<book>
	<title>Foo bar</title>
	<chapter num="1">
		<title>Why XPath Rocks</title>
		<para>Xpath helps me do my job.</para>
	</chapter>
	<chapter num="2">
		<title>Why XPath Sucks</title>
		<para>Too much to learn.</para>
		<section>
			<title>Yada Yada Yada</title>
			<para>blah blah blash</para>
		</section>
	</chapter>
</book>

The '//title' xpath expression would return 'Foo bar', 'Why XPath Rocks',
'Why XPath Sucks', and 'Yada Yada Yada'.
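
To make the behaviour of '//title' concrete, here is a minimal sketch that runs it against the book document above. It assumes the standard javax.xml.xpath API that ships with current JDKs, not Xalan's XPathAPI class or Xindice itself; the class and method names (DescendantTitles, select) are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Xindice or Xalan.
public class DescendantTitles {
    // The sample <book> document from the message, on fewer lines.
    static final String BOOK_XML =
        "<book>"
        + "<title>Foo bar</title>"
        + "<chapter num=\"1\"><title>Why XPath Rocks</title>"
        + "<para>Xpath helps me do my job.</para></chapter>"
        + "<chapter num=\"2\"><title>Why XPath Sucks</title>"
        + "<para>Too much to learn.</para>"
        + "<section><title>Yada Yada Yada</title>"
        + "<para>blah blah blash</para></section></chapter>"
        + "</book>";

    // Parses the XML, evaluates the expression, and returns the text of each matched node.
    static List<String> select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            // Wrap checked parser/XPath exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // '//title' matches title elements at every depth of the tree.
        System.out.println(select(BOOK_XML, "//title"));
        // '/book/title' matches only the title directly under the root element.
        System.out.println(select(BOOK_XML, "/book/title"));
    }
}
```

The second call is there for contrast: the explicit path stops at the first level, while '//title' has to consider every element in the document.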

See also http://www.topxml.com/xsl/XPathRef.asp for info on the //
shorthand.

dave










RE: speeding up XPATH

Posted by John Mc Quillan <jo...@openjawtech.com>.
Have a look at the following link, which suggests that the contrary
might be true. I've not verified this or looked at the code to try and
figure out why but the tests seem to suggest that // type constructs
are in fact faster.

http://dom4j.org/benchmarks/xpath/index.html

John Mc Quillan
+353 (01) 4100681
http://www.openjawtech.com








RE: speeding up XPATH

Posted by Dave Viner <dv...@yahoo-inc.com>.
Also, the exact XPath expression makes a major difference in speed.  In
general, avoid '//' type queries because they force the engine to examine
the entire tree.  Could you supply the xpath expressions you are using ?
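
As an illustration of the rewrite being suggested, the sketch below checks that a '//' query and an explicit path select the same nodes on a small document. The element names ItemResultSet and Item come from the dom4j benchmark discussed elsewhere in this thread; the document content, the class name and the count helper are assumptions made up for the example, and it uses the JDK's javax.xml.xpath rather than Xindice. It shows equivalence only; whether the explicit form is actually faster depends on the engine and should be measured.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for comparing XPath expressions.
public class ExplicitPathVsDescendant {
    // Assumed sample data: three Item elements directly under the root.
    static final String XML =
        "<ItemResultSet>"
        + "<Item id=\"1\"><Name>a</Name></Item>"
        + "<Item id=\"500\"><Name>b</Name></Item>"
        + "<Item id=\"999\"><Name>c</Name></Item>"
        + "</ItemResultSet>";

    // Returns how many nodes the expression selects in the given document.
    static int count(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            // Wrap checked parser/XPath exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Descendant search: every element in the tree is a candidate.
        System.out.println(count(XML, "//Item"));
        // Explicit path: only children of the root element are candidates.
        System.out.println(count(XML, "/ItemResultSet/Item"));
        // Wildcard steps with a predicate, as in the benchmark expressions.
        System.out.println(count(XML, "/*/*[@id=\"500\"]"));
    }
}
```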

dave






Re: speeding up XPATH

Posted by Terry Rosenbaum <Te...@amicas.com>.
 > Is there any way to speed ... [xpath] up ?

One thing that came up recently on this list
was a deficiency in Xerces causing slow XPath
searching. A workaround for the problem is to
supply the following command line argument to
your JVM:

-Dorg.apache.xml.dtm.DTMManager=org.apache.xml.dtm.ref.DTMManagerDefault

(You could also place the necessary property in system properties from 
within
your program.)
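
A minimal sketch of that in-program variant follows. The property name and value are the ones from the command line argument in this message; the class name and the apply() helper are made up for illustration. The assumption is that the property must be set before the first XPath query runs, and that it only has an effect when Xalan's DTM classes are on the classpath.

```java
// Hypothetical helper class; only the property key and value come from the message.
public class DtmManagerWorkaround {
    static final String KEY = "org.apache.xml.dtm.DTMManager";
    static final String VALUE = "org.apache.xml.dtm.ref.DTMManagerDefault";

    // Equivalent to passing -Dorg.apache.xml.dtm.DTMManager=... on the command line,
    // provided it is called before any XPath evaluation takes place.
    static void apply() {
        System.setProperty(KEY, VALUE);
    }

    public static void main(String[] args) {
        apply();
        // Confirm the property is now visible to code that looks it up.
        System.out.println(KEY + "=" + System.getProperty(KEY));
    }
}
```

Calling apply() at the very start of main (or in a static initializer of the entry class) is the safest place, since any later lookup of the property will then see the value.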

See: http://marc.theaimsgroup.com/?l=xindice-dev&m=103470582225684&w=2
-Terry
