Posted to dev@xalan.apache.org by bu...@apache.org on 2001/03/19 23:55:18 UTC
[Bug 1031] New - Recursive XPath iteration is slow...
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=1031
*** shadow/1031 Mon Mar 19 14:55:18 2001
--- shadow/1031.tmp.7412 Mon Mar 19 14:55:18 2001
***************
*** 0 ****
--- 1,148 ----
+ +============================================================================+
+ | Recursive XPath iteration is slow... |
+ +----------------------------------------------------------------------------+
+ | Bug #: 1031 Product: XalanJ2 |
+ | Status: NEW Version: 2.0.0 |
+ | Resolution: Platform: Sun |
+ | Severity: Minor OS/Version: Solaris |
+ | Priority: Low Component: org.apache.xpath |
+ +----------------------------------------------------------------------------+
+ | Assigned To: xalan-dev@xml.apache.org |
+ | Reported By: Frederick_P_Stluka@sbphrd.com |
+ | CC list: Cc: |
+ +----------------------------------------------------------------------------+
+ | URL: |
+ +============================================================================+
+ | DESCRIPTION |
+ Overview Description:
+
+ The following XPath runs much slower than expected:
+ .//compounds/compound[1]/protocols/protocol[1]/centers/center[1]//month
+
+ Importance to me:
+
+ This is not at all important to me, since I can use the 2nd workaround
+ shown below. However, Xalan is a great product, so I figured I'd take
+ the time to report this, and give you the opportunity to fix it.
+
+ Details:
+
+ I am working with an XML tree that looks like:
+
+   <report>
+     <compounds>
+       <compound>
+         <compoundNum>111111</compoundNum>
+         <protocols>
+           <protocol>
+             <protocolNum>001</protocolNum>
+             <centers>
+               <center>
+                 <centerNum>001</centerNum>
+                 <countryId>1</countryId>
+                 <countryName>US</countryName>
+                 <investigator>John Smith</investigator>
+                 <centerActive>Y</centerActive>
+                 <assignedContainers>Y</assignedContainers>
+                 <unassignedContainers>Y</unassignedContainers>
+                 <totalContainers>Y</totalContainers>
+                 <months>
+                   <month>
+                     <yearNum>2001</yearNum>
+                     <monthNum>01</monthNum>
+                     <monthName>Jan</monthName>
+                     <containerCount>17</containerCount>
+                   </month>
+                   ...
+                 </months>
+               </center>
+               ...
+             </centers>
+           </protocol>
+           ...
+         </protocols>
+       </compound>
+       ...
+     </compounds>
+   </report>
+
+ I know for certain that there is exactly one <compounds> node in the
+ entire XML tree. However, I want to avoid hardcoding the absolute
+ location of that node into the application that parses the XML.
+ Therefore, I'd rather refer to that node as:
+ //compounds
+ than:
+ /report/compounds
+
+ My test case has 1 <report> node containing 1 <compounds> node
+ containing 1 <compound> node, which contains 1 <compoundNum> node
+ and 1 <protocols> node. The <protocols> node contains 1 <protocol>
+ node which contains 1 <protocolNum> node and 1 <centers> node.
+ The <centers> node contains 101 <center> nodes, each of which
+ contains 1 each of the following nodes: <centerNum>, <countryId>,
+ <countryName>, <investigator>, <centerActive>, <assignedContainers>,
+ <unassignedContainers>, <totalContainers>, <months>. Each <months>
+ node contains 14 <month> nodes, each of which contains 1 each of:
+ <yearNum>, <monthNum>, <monthName>, <containerCount>. Therefore,
+ there are a total of 1414 <month> nodes (101 centers x 14 months).
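
The test case described above can be rebuilt in memory to reproduce the node counts. The following is a minimal sketch using only the standard JDK DOM classes. The class name ReportFixture, its helper methods, and every element value other than the ones shown in the sample tree are placeholders chosen for illustration, not the reporter's actual data.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical fixture: rebuilds the tree shape described in the report.
class ReportFixture {

    // Appends a new empty child element and returns it.
    static Element child(Document doc, Element parent, String name) {
        Element e = doc.createElement(name);
        parent.appendChild(e);
        return e;
    }

    // Appends a new child element holding a text value.
    static void leaf(Document doc, Element parent, String name, String text) {
        child(doc, parent, name).setTextContent(text);
    }

    // Builds 1 report > 1 compounds > 1 compound > 1 protocols > 1 protocol
    // > 1 centers > `centers` center nodes, each with `monthsPerCenter` months.
    static Document build(int centers, int monthsPerCenter) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException(ex);
        }
        Element report = doc.createElement("report");
        doc.appendChild(report);
        Element compound = child(doc, child(doc, report, "compounds"), "compound");
        leaf(doc, compound, "compoundNum", "111111");
        Element protocol = child(doc, child(doc, compound, "protocols"), "protocol");
        leaf(doc, protocol, "protocolNum", "001");
        Element centersEl = child(doc, protocol, "centers");

        for (int c = 1; c <= centers; c++) {
            Element center = child(doc, centersEl, "center");
            leaf(doc, center, "centerNum", String.format("%03d", c));
            // Placeholder values; only the element names matter for the XPath.
            leaf(doc, center, "countryId", "1");
            leaf(doc, center, "countryName", "US");
            leaf(doc, center, "investigator", "John Smith");
            leaf(doc, center, "centerActive", "Y");
            leaf(doc, center, "assignedContainers", "Y");
            leaf(doc, center, "unassignedContainers", "Y");
            leaf(doc, center, "totalContainers", "Y");
            Element months = child(doc, center, "months");
            for (int m = 1; m <= monthsPerCenter; m++) {
                Element month = child(doc, months, "month");
                leaf(doc, month, "yearNum", "2001");
                leaf(doc, month, "monthNum", String.format("%02d", m));
                leaf(doc, month, "monthName", "M" + m);
                leaf(doc, month, "containerCount", "17");
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        Document doc = build(101, 14);
        // 101 centers x 14 months each
        System.out.println(doc.getElementsByTagName("month").getLength()); // prints 1414
    }
}
```

A document built this way could be passed as xmlDOM to the snippets in this report, assuming the Xalan jars are on the classpath.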
+
+ The following code snippet executes in about 73 seconds:
+
+     NodeIterator xmlMonthIterator = XPathAPI.selectNodeIterator
+         (xmlDOM,
+          ".//compounds/compound[1]/protocols/protocol[1]/centers/center[1]//month");
+     Node xmlMonth1 = xmlMonthIterator.nextNode();
+     while (xmlMonth1 != null)
+     {
+         do_something();
+         xmlMonth1 = xmlMonthIterator.nextNode();
+     }
+
+ The XPath is intended to iterate over all <month> nodes of the first
+ <center> of the first <protocol> of the first <compound>. As expected,
+ it iterates through those 14 nodes quickly (in less than 0.25 seconds).
+ However, it then spends nearly 73 seconds in the final call to
+ nextNode(), the one that returns null and ends the loop. It seems to
+ be searching through the rest of the <center> nodes in the rest of the
+ <protocol> nodes in the rest of the <compound> nodes.
+
+ Workaround 1:
+
+ I can avoid the 73-second delay by using an absolute XPath instead:
+
+     NodeIterator xmlMonthIterator = XPathAPI.selectNodeIterator
+         (xmlDOM,
+          "/report/compounds/compound[1]/protocols/protocol[1]/centers/center[1]//month");
+     Node xmlMonth1 = xmlMonthIterator.nextNode();
+     while (xmlMonth1 != null)
+     {
+         do_something();
+         xmlMonth1 = xmlMonthIterator.nextNode();
+     }
+
+ However, I prefer not to hardcode the location of the <compounds>
+ node so exactly.
+
+ Workaround 2:
+
+ Alternatively, I can avoid the 73 second delay by navigating the tree
+ in two steps, as:
+
+     Node xmlCenter1 = XPathAPI.selectSingleNode
+         (xmlDOM,
+          ".//compounds/compound[1]/protocols/protocol[1]/centers/center[1]");
+     NodeIterator xmlMonthIterator = XPathAPI.selectNodeIterator
+         (xmlCenter1,
+          ".//month");
+     Node xmlMonth1 = xmlMonthIterator.nextNode();
+     while (xmlMonth1 != null)
+     {
+         do_something();
+         xmlMonth1 = xmlMonthIterator.nextNode();
+     }
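
To compare the original expression with the two workarounds side by side, a small harness along the following lines may help. This is only a sketch: it uses the standard JAXP javax.xml.xpath API rather than Xalan's XPathAPI, so it returns a complete NodeList instead of a lazy NodeIterator, and its timings will not necessarily show the stall described in this report. The class name XPathComparison, its methods, and the reduced sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical harness: same three queries as the report, via JAXP (not XPathAPI).
class XPathComparison {

    // Path from <compounds> down to the first <center>, as in the report.
    static final String FIRST_CENTER =
        "compounds/compound[1]/protocols/protocol[1]/centers/center[1]";

    // Builds a reduced stand-in document holding only the elements the paths touch.
    static Document sample(int centers, int monthsPerCenter) {
        StringBuilder xml = new StringBuilder(
            "<report><compounds><compound><protocols><protocol><centers>");
        for (int c = 0; c < centers; c++) {
            xml.append("<center><months>");
            for (int m = 1; m <= monthsPerCenter; m++) {
                xml.append("<month><monthNum>").append(m).append("</monthNum></month>");
            }
            xml.append("</months></center>");
        }
        xml.append("</centers></protocol></protocols></compound></compounds></report>");
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml.toString())));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Evaluates expr against the context node and returns how many nodes it selects.
    static int count(Node context, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expr, context, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Workaround 2: select the first <center>, then search only beneath it.
    static int countTwoStep(Node root) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node center = (Node) xpath.evaluate(".//" + FIRST_CENTER, root, XPathConstants.NODE);
            return count(center, ".//month");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = sample(101, 14);

        long t0 = System.nanoTime();
        int original = count(doc, ".//" + FIRST_CENTER + "//month");
        long t1 = System.nanoTime();
        int absolute = count(doc, "/report/" + FIRST_CENTER + "//month");
        long t2 = System.nanoTime();
        int twoStep = countTwoStep(doc);
        long t3 = System.nanoTime();

        System.out.printf("original:     %d nodes, %d ms%n", original, (t1 - t0) / 1_000_000);
        System.out.printf("workaround 1: %d nodes, %d ms%n", absolute, (t2 - t1) / 1_000_000);
        System.out.printf("workaround 2: %d nodes, %d ms%n", twoStep, (t3 - t2) / 1_000_000);
    }
}
```

All three queries should select the same 14 <month> nodes; only the elapsed times are expected to differ between XPath implementations.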
+
+ --Fred Stluka
+ 3/19/2001