Posted to j-users@xalan.apache.org by sh...@e-z.net on 2017/02/19 00:54:53 UTC

Re: Why evaluating "/site/*" is super slow?

>
> Dear all,
>
> I'm a Ph.D. student at Kochi University of Technology in Japan.
> I recently started using Xalan for some implementation work
> as part of a project, and I have run into a performance problem.
> Here is the function I wrote for evaluating XPath
> queries.
>
> 1:  public void process(org.w3c.dom.Document doc, String xquery){
> 2:      ... // skipped some irrelevant code
> 3:      NodeList nodes = XPathAPI.selectNodeList(doc, xquery);
> 4:      ... // skipped some irrelevant code
> 5:  }
>
> In line 3, an XPath query is applied to an XML document. The document
> is 596 MB in size and was generated by xmlgen from the XMark
> project. The query is simple: "/site/*". Since "site" has only six
> children, it should take very little time. However, it took
> 3.6 seconds.
>
> Then I tried evaluating it directly by calling getChildNodes() on
> the root of the document, and the same query took only 0.003
> seconds.
>   Node root = doc.getChildNodes().item(0);
>   NodeList chs = root.getChildNodes();
>
> Although this was fast, it is hard to evaluate complex queries
> this way. I spent a couple of days trying to solve this problem,
> but I still have no idea how. Could you please give me some help on
> how to improve the performance of Xalan?
>
> Thank you in advance.
>
>
> Best regards,
> Wei HAO.
>
>
I am releasing this to the J-USERS group.

Re: Why evaluating "/site/*" is super slow?

Posted by Wei HAO <18...@gs.kochi-tech.ac.jp>.

Hi, Christoffer,

Thanks for your reply.
I will try to fix the problem based on the information you provided.


Best regards,
Wei HAO

On 2017/2/22 16:58, Christoffer Bruun wrote:
> Hi,
>
> This is because the Xalan XPath engine creates some internal lookup
> tables/indexes *every time* you perform an XPath query.
>
> You notice this on large documents when creating the lookup tables
> consumes much more time than the actual query.
>
> You could look into
> https://xalan.apache.org/old/xalan-j/apidocs/org/apache/xpath/CachedXPathAPI.html
>
>
> br
>
> Christoffer Bruun
>
>

Re: Why evaluating "/site/*" is super slow?

Posted by Christoffer Bruun <cd...@flyingpigs.dk>.

Hi,

This is because the Xalan XPath engine creates some internal lookup
tables/indexes *every time* you perform an XPath query.

You notice this on large documents when creating the lookup tables 
consumes much more time than the actual query.

You could look into 
https://xalan.apache.org/old/xalan-j/apidocs/org/apache/xpath/CachedXPathAPI.html
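
A minimal sketch of how that might look, assuming Xalan-J is on the
classpath. The class name CachedXPathExample and its process method are
illustrative only and not part of Xalan. The idea is to create one
CachedXPathAPI instance and reuse it for every query against the same
document, instead of calling the static XPathAPI methods each time:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerException;
import org.apache.xpath.CachedXPathAPI;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical wrapper class, for illustration only.
public class CachedXPathExample {
    // One instance is kept and reused, so the per-document setup work
    // is not repeated for every query.
    private final CachedXPathAPI xpath = new CachedXPathAPI();

    // Evaluates the query with the document as the context node.
    public NodeList process(Document doc, String xquery)
            throws TransformerException {
        return xpath.selectNodeList(doc, xquery);
    }

    public static void main(String[] args) throws Exception {
        // Tiny stand-in for the large XMark document.
        String xml = "<site><regions/><categories/><people/></site>";
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        CachedXPathExample example = new CachedXPathExample();
        // Both calls go through the same CachedXPathAPI instance.
        NodeList first = example.process(doc, "/site/*");
        NodeList second = example.process(doc, "/site/*");
        System.out.println(first.getLength() + " " + second.getLength());
    }
}
```

As far as I recall, the CachedXPathAPI Javadoc cautions against reusing
an instance when the source tree is modified between queries; in that
case it is probably safer to create a fresh instance after each change.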

br

Christoffer Bruun

