Posted to dev@xalan.apache.org by bu...@apache.org on 2001/03/19 23:55:18 UTC

[Bug 1031] New - Recursive XPath iteration is slow...

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=1031

*** shadow/1031	Mon Mar 19 14:55:18 2001
--- shadow/1031.tmp.7412	Mon Mar 19 14:55:18 2001
***************
*** 0 ****
--- 1,148 ----
+ +============================================================================+
+ | Recursive XPath iteration is slow...                                       |
+ +----------------------------------------------------------------------------+
+ |        Bug #: 1031                        Product: XalanJ2                 |
+ |       Status: NEW                         Version: 2.0.0                   |
+ |   Resolution:                            Platform: Sun                     |
+ |     Severity: Minor                    OS/Version: Solaris                 |
+ |     Priority: Low                       Component: org.apache.xpath        |
+ +----------------------------------------------------------------------------+
+ |  Assigned To: xalan-dev@xml.apache.org                                     |
+ |  Reported By: Frederick_P_Stluka@sbphrd.com                                |
+ |      CC list: Cc:                                                          |
+ +----------------------------------------------------------------------------+
+ |          URL:                                                              |
+ +============================================================================+
+ |                              DESCRIPTION                                   |
+ Overview Description: 
+ 
+     The following XPath runs much slower than expected:
+         .//compounds/compound[1]/protocols/protocol[1]/centers/center[1]//month
+ 
+ Importance to me:
+ 
+     This is not at all important to me, since I can use the 2nd workaround
+     shown below.  However, Xalan is a great product, so I figured I'd take
+     the time to report this, and give you the opportunity to fix it.
+ 
+ Details:
+ 
+     I am working with an XML tree that looks like:
+ 
+         <report>
+          <compounds>
+           <compound>
+            <compoundNum>111111</compoundNum>
+            <protocols>
+             <protocol>
+              <protocolNum>001</protocolNum>
+              <centers>
+               <center>
+                <centerNum>001</centerNum>
+                <countryId>1</countryId>
+                <countryName>US</countryName>
+                <investigator>John Smith</investigator>
+                <centerActive>Y</centerActive>
+                <assignedContainers>Y</assignedContainers>
+                <unassignedContainers>Y</unassignedContainers>
+                <totalContainers>Y</totalContainers>
+                <months>
+                 <month>
+                  <yearNum>2001</yearNum>
+                  <monthNum>01</monthNum>
+                  <monthName>Jan</monthName>
+                  <containerCount>17</containerCount>
+                 </month>
+                 ...
+                </months>
+               </center>
+               ...
+              </centers>
+             </protocol>
+             ...
+            </protocols>
+           </compound>
+           ...
+          </compounds>
+         </report>
+ 
+     I know for certain that there is exactly one <compounds> node in the 
+     entire XML tree.  However, I want to avoid hardcoding the absolute 
+     location of that node into the application that parses the XML.  
+     Therefore, I'd rather refer to that node as:
+         //compounds
+     than:
+         /report/compounds
+ 
+     My test case has 1 <report> node containing 1 <compounds> node 
+     containing 1 <compound> node, which contains 1 <compoundNum> node 
+     and 1 <protocols> node.  The <protocols> node contains 1 <protocol>
+     node which contains 1 <protocolNum> node and 1 <centers> node.
+     The <centers> node contains 101 <center> nodes, each of which 
+     contains 1 each of the following nodes:  <centerNum>, <countryId>,
+     <countryName>, <investigator>, <centerActive>, <assignedContainers>,
+     <unassignedContainers>, <totalContainers>, <months>.  Each <months>
+     node contains 14 <month> nodes, each of which contains 1 each of:
+     <yearNum>, <monthNum>, <monthName>, <containerCount>.  Therefore, 
+     there are a total of 1414 <month> nodes.
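The test case described above can be approximated with a small generator. This is only a sketch: the class name TestReportBuilder, the helper add(), and all leaf values are hypothetical placeholders; only the element names and the counts (101 centers, 14 months each) come from the description. It uses the JAXP DocumentBuilderFactory to create an empty DOM.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper (not part of the original report): rebuilds a DOM
// with the same shape as the test case described above.
class TestReportBuilder {
    static final int CENTERS = 101;           // count taken from the report
    static final int MONTHS_PER_CENTER = 14;  // count taken from the report

    // Appends <name>text</name> to parent (no text node when text is null).
    static Element add(Element parent, String name, String text) {
        Document doc = parent.getOwnerDocument();
        Element child = doc.createElement(name);
        if (text != null) {
            child.appendChild(doc.createTextNode(text));
        }
        parent.appendChild(child);
        return child;
    }

    static Document build() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element report = doc.createElement("report");
        doc.appendChild(report);
        Element compound = add(add(report, "compounds", null), "compound", null);
        add(compound, "compoundNum", "111111");
        Element protocol = add(add(compound, "protocols", null), "protocol", null);
        add(protocol, "protocolNum", "001");
        Element centers = add(protocol, "centers", null);

        String[] centerLeaves = {"centerNum", "countryId", "countryName", "investigator",
            "centerActive", "assignedContainers", "unassignedContainers", "totalContainers"};
        String[] monthLeaves = {"yearNum", "monthNum", "monthName", "containerCount"};

        for (int c = 0; c < CENTERS; c++) {
            Element center = add(centers, "center", null);
            for (String leaf : centerLeaves) {
                add(center, leaf, "x");  // placeholder value
            }
            Element months = add(center, "months", null);
            for (int m = 0; m < MONTHS_PER_CENTER; m++) {
                Element month = add(months, "month", null);
                for (String leaf : monthLeaves) {
                    add(month, leaf, "x");  // placeholder value
                }
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        Document doc = build();
        // 101 centers * 14 months each
        System.out.println(doc.getElementsByTagName("month").getLength()); // prints 1414
    }
}
```

The resulting Document can be passed as xmlDOM to the snippets that follow.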
+ 
+     The following code snippet executes in about 73 seconds:
+ 
+         NodeIterator xmlMonthIterator = XPathAPI.selectNodeIterator
+             (xmlDOM,
+              ".//compounds/compound[1]/protocols/protocol[1]/centers/center[1]//month");
+         Node xmlMonth1 = xmlMonthIterator.nextNode();
+         while (xmlMonth1 != null)
+         {
+             do_something();
+             xmlMonth1 = xmlMonthIterator.nextNode();
+         }
+ 
+     The XPath is intended to iterate over all <month> nodes of the first
+     <center> of the first <protocol> of the first <compound>.  As expected,
+     it quickly (less than 0.25 seconds) iterates through those 14 nodes.  
+     However, it then spends nearly 73 seconds in the final call to
+     nextNode(), the one that returns null after the 14th node.  It seems
+     to be searching through the rest of the <center> nodes in the rest
+     of the <protocol> nodes in the rest of the <compound> nodes.
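One way to confirm where the time goes is to time each nextNode() call separately. The sketch below is an assumption-laden illustration, not the measurement code used for this report: NextNodeTimer and its methods are hypothetical names, and the demo drives a standard DOM Level 2 traversal iterator (assuming the JAXP Document implementation supports DocumentTraversal) rather than the Xalan iterator. The time() method accepts any org.w3c.dom.traversal.NodeIterator, so the iterator returned by XPathAPI.selectNodeIterator could be passed in instead.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;

// Hypothetical helper (not part of the original report).
class NextNodeTimer {
    // Drains the iterator and returns
    // {nodes returned, ns spent in calls that returned a node,
    //  ns spent in the final call that returned null}.
    static long[] time(NodeIterator it) {
        long count = 0;
        long productiveNanos = 0;
        while (true) {
            long start = System.nanoTime();
            Node n = it.nextNode();
            long elapsed = System.nanoTime() - start;
            if (n == null) {
                return new long[] {count, productiveNanos, elapsed};
            }
            count++;
            productiveNanos += elapsed;
        }
    }

    // Demo on a tiny document with 14 <month> elements, using a plain DOM
    // traversal iterator as a stand-in for the XPath iterator.
    static long[] demo() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element months = doc.createElement("months");
        doc.appendChild(months);
        for (int i = 0; i < 14; i++) {
            months.appendChild(doc.createElement("month"));
        }
        NodeFilter onlyMonths = n -> "month".equals(n.getNodeName())
            ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_SKIP;
        NodeIterator it = ((DocumentTraversal) doc)
            .createNodeIterator(doc, NodeFilter.SHOW_ELEMENT, onlyMonths, false);
        return time(it);
    }

    public static void main(String[] args) {
        long[] r = demo();
        System.out.println("nodes returned:        " + r[0]);
        System.out.println("ns in returning calls: " + r[1]);
        System.out.println("ns in final null call: " + r[2]);
    }
}
```

With the slow XPath above, the expectation from the description is that the third figure dominates the total.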
+ 
+ Workaround 1:
+ 
+     I can avoid the 73-second delay by using an absolute XPath instead:
+ 
+         NodeIterator xmlMonthIterator = XPathAPI.selectNodeIterator
+             (xmlDOM,
+              "/report/compounds/compound[1]/protocols/protocol[1]/centers/center[1]//month");
+         Node xmlMonth1 = xmlMonthIterator.nextNode();
+         while (xmlMonth1 != null)
+         {
+             do_something();
+             xmlMonth1 = xmlMonthIterator.nextNode();
+         }
+ 
+     However, I prefer to not specify the location of the <compounds> node
+     so exactly.  
+ 
+ Workaround 2:
+ 
+     Alternatively, I can avoid the 73-second delay by navigating the tree 
+     in two steps:
+ 
+         Node xmlCenter1 = XPathAPI.selectSingleNode
+             (xmlDOM,
+              ".//compounds/compound[1]/protocols/protocol[1]/centers/center[1]");
+         NodeIterator xmlMonthIterator = XPathAPI.selectNodeIterator
+             (xmlCenter1, 
+              ".//month");
+         Node xmlMonth1 = xmlMonthIterator.nextNode();
+         while (xmlMonth1 != null)
+         {
+             do_something();
+             xmlMonth1 = xmlMonthIterator.nextNode();
+         }
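For readers without Xalan's XPathAPI on the classpath, the same two-step navigation can be sketched with the standard javax.xml.xpath API. Note the swap: this is not the API used in this report, it evaluates eagerly into a NodeList rather than a lazy NodeIterator, and it says nothing about whether the slowdown reproduces there. The class TwoStepMonths, its methods, and the tiny sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of the original report).
class TwoStepMonths {
    // Step 1: locate the first <center>; step 2: search only below it.
    static NodeList monthsOfFirstCenter(Node context) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            Node center = (Node) xpath.evaluate(
                ".//compounds/compound[1]/protocols/protocol[1]/centers/center[1]",
                context, XPathConstants.NODE);
            if (center == null) {
                throw new IllegalArgumentException("no matching <center> found");
            }
            return (NodeList) xpath.evaluate(".//month", center, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Demo on a made-up document: first center has 2 months, second has 3.
    static int demo() {
        String xml =
            "<report><compounds><compound><protocols><protocol><centers>"
            + "<center><months><month/><month/></months></center>"
            + "<center><months><month/><month/><month/></months></center>"
            + "</centers></protocol></protocols></compound></compounds></report>";
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return monthsOfFirstCenter(doc).getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```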
+ 
+ --Fred Stluka
+ 3/19/2001