Posted to j-users@xalan.apache.org by Tobias Wahlström <to...@pricerunner.com> on 2002/06/13 14:53:16 UTC

Re: Where did all resources disappear? (The mystery is solved!)

I guess this starts to look silly - I am answering my own mails... But I 
guess that this is interesting in a more general sense.

I have solved all my problems now, or at least worked around them. The 
reason my application ran out of memory was that when you return an 
org.w3c.dom.Node from an extension element, a new DTM is allocated. That 
DTM is never released, so heavy use of such extension elements causes 
a lot of DTMs to be created.

My fix is to add the DOM result to the result tree in some other way 
than returning it. I wrote a class, DOM2SAXWalker (there is probably an 
existing class that does this job), which takes a DOM tree and calls a 
ContentHandler with the corresponding events. As the content handler I 
used the ResultTreeHandler. This way no additional DTMs are created and 
my application does not run out of memory.
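
A minimal sketch of what such a DOM-to-SAX walker might look like, using 
only the standard org.w3c.dom and org.xml.sax interfaces. The class name 
Dom2SaxSketch and its walk method are made up for illustration and are 
not the DOM2SAXWalker described above. Namespace prefix mappings and 
comments are not handled. Any ContentHandler can be passed in; in the 
setup described above that would be Xalan's ResultTreeHandler.

```java
import org.w3c.dom.Attr;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

/** Hypothetical sketch: replays a DOM subtree as SAX events. */
final class Dom2SaxSketch {

    /** Fires SAX events for the node and its descendants (no startDocument/endDocument). */
    static void walk(Node node, ContentHandler handler) throws SAXException {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: {
                // Copy the DOM attributes into a SAX attribute list.
                AttributesImpl atts = new AttributesImpl();
                NamedNodeMap map = node.getAttributes();
                for (int i = 0; i < map.getLength(); i++) {
                    Attr a = (Attr) map.item(i);
                    atts.addAttribute(orEmpty(a.getNamespaceURI()), localName(a),
                            a.getName(), "CDATA", a.getValue());
                }
                String uri = orEmpty(node.getNamespaceURI());
                handler.startElement(uri, localName(node), node.getNodeName(), atts);
                walkChildren(node, handler);
                handler.endElement(uri, localName(node), node.getNodeName());
                break;
            }
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE: {
                char[] chars = node.getNodeValue().toCharArray();
                handler.characters(chars, 0, chars.length);
                break;
            }
            case Node.PROCESSING_INSTRUCTION_NODE:
                handler.processingInstruction(node.getNodeName(), node.getNodeValue());
                break;
            case Node.DOCUMENT_NODE:
            case Node.DOCUMENT_FRAGMENT_NODE:
                walkChildren(node, handler);
                break;
            default:
                // Comments, entity references and so on are ignored in this sketch.
                break;
        }
    }

    private static void walkChildren(Node parent, ContentHandler handler) throws SAXException {
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, handler);
        }
    }

    /** Nodes created without namespace support report a null local name. */
    private static String localName(Node n) {
        String local = n.getLocalName();
        return local != null ? local : n.getNodeName();
    }

    private static String orEmpty(String s) {
        return s != null ? s : "";
    }
}
```

An extension element would call Dom2SaxSketch.walk(domResult, handler) 
and return nothing, instead of returning the Node itself.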

Why doesn't Xalan do something similar? Why does it go through a DTM?


Best regards,
Tobias


Tobias Wahlström wrote:

> I have found one of the reasons for my problems. We have started to 
> use j2sdk1.4.0_01 from Sun for Linux. The javax.xml.* packages 
> (recursively) have been bundled into the jar file 
> $JAVA_HOME/jre/lib/rt.jar. Sun is so "kind" that they provide an 
> implementation of the interfaces too, Crimson and Xalan. So it doesn't 
> matter which jars I add to my classpath unless I strip rt.jar so that 
> it no longer includes any XML parser or transformer. In my opinion they 
> shouldn't put things they don't support in their own jars. If the 
> things are old and not so good it is even worse...
>
> My former problems have been solved by this, but I got some new ones.
>
> I have heard that this new DTM should decrease memory use... 
> Hmmm... I get java.lang.OutOfMemoryError all the time when I try to do 
> transformations. I have tried to modify my application as follows:
>
> 1) Parse the input using a custom parser and write the result to a 
> temp-file.
> 2) Use the javax.xml.* packages to parse the temp-file and transform 
> it with a stylesheet and output it as a file again.
>
> The first step works fine. The second step runs out of memory. I have 
> added trace printouts to the Xalan code to see what happens. The 
> parser is Xerces and Xalan is run in incremental mode - shouldn't that 
> decrease memory usage? Btw, I can mention, again, that my input data 
> is very large, but the stylesheets are simple. I normally match one 
> node and then
>
> I also traced DTM allocation, and I found out that 2380 DTMs are 
> allocated but only one is released. Should it be like that?
> I have thrown an exception (which is immediately caught) to see where 
> DTMs are created, and it turns out that most DTMs are created at:
>
>        [...]
>        at org.apache.xml.dtm.ref.DTMManagerDefault.addDTM(DTMManagerDefault.java:158)
>        at org.apache.xml.dtm.ref.DTMManagerDefault.getDTM(DTMManagerDefault.java:279)
>        at org.apache.xml.dtm.ref.DTMManagerDefault.getDTMHandleFromNode(DTMManagerDefault.java:570)
>        at org.apache.xpath.XPathContext.getDTMHandleFromNode(XPathContext.java:214)
>        at org.apache.xalan.extensions.XSLProcessorContext.outputToResultTree(XSLProcessorContext.java:263)
>        at org.apache.xalan.extensions.ExtensionHandlerJavaClass.processElement(ExtensionHandlerJavaClass.java:423)
>        at org.apache.xalan.templates.ElemExtensionCall.execute(ElemExtensionCall.java:307)
>        at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2243)
>        at org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:710)
>        at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:422)
>        at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>        at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:422)
>        at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>        at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:422)
>        at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>        at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:422)
>        at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>        at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:422)
>        at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>        at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2243)
>        at org.apache.xalan.transformer.TransformerImpl.applyTemplateToNode(TransformerImpl.java:2069)
>        at org.apache.xalan.transformer.TransformerImpl.transformNode(TransformerImpl.java:1171)
>        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:634)
>        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1088)
>        at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1066)
>        [...]
>
> Is this a good or a bad thing?
> Btw. I'm using extensions a lot, to do some magic. Is this generally a 
> bad idea?
>
>
>
> Best regards,
> Tobias
>
>
>
> Tobias Wahlström wrote:
>
>>
>>
>> Joseph Kesselman wrote:
>>
>>> On Tuesday, 06/11/2002 at 09:16 ZE2, Tobias Wahlström 
>>> <to...@pricerunner.com> wrote:
>>> > I will try to be more exact about how I use xalan, but I can't send
>>> > you the code since it is for a commercial company.
>>>
>>> We don't necessarily need the "real" code -- you could probably come 
>>> up with a simplified driver which simulates your usage pattern and 
>>> provokes the same problem. That's a nontrivial bit of coding work, 
>>> admittedly, but either you or we will probably be stuck with doing 
>>> it...
>>>
>> Well... the system is rather complex, so that would be tedious work.
>>
>>>
>>>
>>> > I use extensions written in Java to enhance xalan with a stack of
>>> > "process-orders". When processing an "xml-document" with a stylesheet
>>> > new "process-orders" are pushed onto the stack, and processed when
>>> > the first one is done.
>>>
>>> I'm really not convinced I'm understanding the data flow you're 
>>> describing. Could you draw a diagram, or give a step-by-step 
>>> description of how data is actually flowing through Xalan? Where and 
>>> how do the extensions deliver their data -- as part of the main 
>>> Xalan input stream, as DOM trees returned to Xalan, as something else?
>>>
>> My application produces SAX events, which are fed to Xalan as 
>> input. Xalan is (often) invoked by using the same 
>> javax.xml.transform.Templates object several times on different input. 
>> The output is assembled from all the processing runs. The document is 
>> started before the first run is done, and the document element is 
>> also added. After that all the runs are made, and the document 
>> elements of each of them are removed by my SAX event filter. When all 
>> runs are done, the document element is ended and the document 
>> closed.
>>
>> Within a single processing run my extension is invoked pretty frequently. 
>> Often the content of my extension elements is evaluated to DOM by 
>> changing the ContentHandler to an object that builds DOM trees and then 
>> invoking the method ElemTemplateElement.execute(Transformer t). When 
>> the extension is done and wants to put things in the output (which it 
>> often does), a text node is almost always returned to Xalan.
>>
>>>
>>>
>>> > I have tried a simple example which only does two xslt-processings.
>>> > If I do the job as is, it will run out of DTM IDs. If I limit the match
>>> > pattern so that only one element is matched, and much less output is
>>> > produced, it works fine. The size of the input document is less than
>>> > 400K, so it can't be the size of the input that is the problem.
>>>
>>> How deeply recursive is your stylesheet? How heavily is it using 
>>> XSLT variables?
>>>
>> Very simple stylesheet, normally not recursive at all. Variables are 
>> used very little, but some.
>>
>>>
>>>
>>>
>>> > doesn't make any difference. It burps out loads of exceptions, the
>>> > log file is over 18M.
>>>
>>> The number of exceptions doesn't tell me anything very useful. 
>>> Patterns of where the exceptions are occurring might.
>>>
>> Well I understand that, but I wanted to express that it was a 
>> repeated error. I can fill in that the exception is the same all the 
>> time, at least the top seven levels of the trace, and that it looks 
>> like this:
>>
>> javax.xml.transform.TransformerException: org.apache.xml.dtm.DTMException: No more DTM IDs are available
>>    at org.apache.xalan.templates.ElemExtensionCall.execute(ElemExtensionCall.java:325)
>>    at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2182)
>>    at org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:678)
>>    at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2182)
>>    at org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:678)
>>    at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:423)
>>    at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>>    at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:423)
>>    at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>>    at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:423)
>>    at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>>    at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:423)
>>    at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>>    at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:423)
>>    at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>>    at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2182)
>>    at org.apache.xalan.transformer.TransformerImpl.applyTemplateToNode(TransformerImpl.java:2008)
>>    at org.apache.xalan.transformer.TransformerImpl.transformNode(TransformerImpl.java:1171)
>>    at org.apache.xalan.transformer.TransformerImpl.run(TransformerImpl.java:3135)
>>    at org.apache.xalan.transformer.TransformerHandlerImpl.endDocument(TransformerHandlerImpl.java:433)
>>    [... my code ...]
>> Caused by: org.apache.xml.dtm.DTMException: No more DTM IDs are available
>>    at org.apache.xml.dtm.ref.DTMManagerDefault.getFirstFreeDTMID(DTMManagerDefault.java:134)
>>    at org.apache.xml.dtm.ref.DTMManagerDefault.getDTM(DTMManagerDefault.java:184)
>>    at org.apache.xml.dtm.ref.DTMManagerDefault.getDTMHandleFromNode(DTMManagerDefault.java:438)
>>    at org.apache.xpath.XPathContext.getDTMHandleFromNode(XPathContext.java:195)
>>    at org.apache.xalan.extensions.XSLProcessorContext.outputToResultTree(XSLProcessorContext.java:257)
>>    at org.apache.xalan.extensions.ExtensionHandlerJavaClass.processElement(ExtensionHandlerJavaClass.java:423)
>>    at org.apache.xalan.templates.ElemExtensionCall.execute(ElemExtensionCall.java:307)
>>    ... 45 more
>> ---------
>> org.apache.xml.dtm.DTMException: No more DTM IDs are available
>>    at org.apache.xml.dtm.ref.DTMManagerDefault.getFirstFreeDTMID(DTMManagerDefault.java:134)
>>    at org.apache.xml.dtm.ref.DTMManagerDefault.getDTM(DTMManagerDefault.java:184)
>>    at org.apache.xml.dtm.ref.DTMManagerDefault.getDTMHandleFromNode(DTMManagerDefault.java:438)
>>    at org.apache.xpath.XPathContext.getDTMHandleFromNode(XPathContext.java:195)
>>    at org.apache.xalan.extensions.XSLProcessorContext.outputToResultTree(XSLProcessorContext.java:257)
>>    at org.apache.xalan.extensions.ExtensionHandlerJavaClass.processElement(ExtensionHandlerJavaClass.java:423)
>>    at org.apache.xalan.templates.ElemExtensionCall.execute(ElemExtensionCall.java:307)
>>    at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2182)
>>    at org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:678)
>>    at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2182)
>>    at org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:678)
>>    at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:423)
>>    at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>>    at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:423)
>>    at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>>    at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:423)
>>    at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>>    at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:423)
>>    at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>>    at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(ElemApplyTemplates.java:423)
>>    at org.apache.xalan.templates.ElemApplyTemplates.execute(ElemApplyTemplates.java:226)
>>    at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2182)
>>    at org.apache.xalan.transformer.TransformerImpl.applyTemplateToNode(TransformerImpl.java:2008)
>>    at org.apache.xalan.transformer.TransformerImpl.transformNode(TransformerImpl.java:1171)
>>    at org.apache.xalan.transformer.TransformerImpl.run(TransformerImpl.java:3135)
>>    at org.apache.xalan.transformer.TransformerHandlerImpl.endDocument(TransformerHandlerImpl.java:433)
>>    at com.pricerunner.ture.parser.HTMLReader.postParse(HTMLReader.java:252)
>>    at com.pricerunner.ture.parser.HTMLReader.parse(HTMLReader.java:208)
>>    [... my code ...]
>>
>>
>>>
>>>
>>> > When are new DTMs allocated? If I want to debug this, should I modify
>>> > DTMManagerDefault to trace every new DTM that is created and see who
>>> > wants to create them?
>>>
>>> The DTMManager -- which I think will always be DTMManagerDefault in 
>>> the current code, with the exception of the SQL Extension package -- 
>>> is indeed where new DTM trees are allocated; see the 
>>> getDTM(Source,....) method.
>>>
>>> DTMs are used for:
>>> * The main source document
>>> * Documents read using the document() function
>>> * Result Tree Fragments (variables and parameters which contain nodes 
>>>   rather than a simple string)
>>> * The output document, in some situations related to DOMResult.
>>>
>> I guess I create quite a lot of result tree fragments in my 
>> extensions, but I do it (as I said before) by changing the 
>> ContentHandler to a DOM-builder and then calling 
>> ElemTemplateElement.execute(Transformer t).
>>
>>>
>>>
>>>
>>> Should we take this back to the mailing list so others can lend 
>>> their own insights?
>>>
>> Done, sorry...
>>
>>>
>>> ______________________________________
>>> Joe Kesselman / IBM Research
>>>
>>
>>
>>
>
>
>



Re: XSLT function problem

Posted by Peter Davis <pe...@pdavis.cx>.

On Friday 14 June 2002 04:48, rabii mouali wrote:
> How can I replace an &apos; in the input with \&apos; in the output?

In Javascript, &apos; is only one of several characters that will cause 
problems, and must be escaped with a backslash '\'.  Here is a complete list 
of the escaped characters:

* backslash '\\' (&#92;)
* single quote '\'' (&apos;)
* double quote '\"' (&quot;)
* line-feed '\n' (&#10;)
* carriage-return '\r' (&#13;)
* tab '\t' (&#9;)

The following characters should be escaped, but are illegal in XML, and so 
cannot cause a problem when using XSLT:

* form-feed '\f' (&#12;)
* backspace '\b' (&#8;)
* vertical-tab '\v' (&#11;)

As you have discovered, the translate() function cannot be used to do this, 
since it can only translate single characters.  So, you must use a recursive 
template to search for each appearance of the desired character.

The process goes like this: say you have a string:

"Hello. Goodbye."

and you want to replace every "." (period) with an '!' (exclamation point).  
The first step is to find the first period, and output the part of the string 
that comes before that.  The XSLT contains() and substring-before() functions 
are useful for this purpose.  So after the first step, you are left with 
this:

Output: "Hello"
Remaining: ". Goodbye."

Now you output the character that will replace the period:

Output: "Hello!"
Remaining: ". Goodbye."

Next, you get the part of the string that comes after the first period:

Remaining: " Goodbye."

and repeat the process:

Output: "Hello! Goodbye"
Remaining: "."

Output: "Hello! Goodbye!"
Remaining: "."

Remaining: "" (no more remaining, so stop)

This basically functions like a loop, but XSLT has no while() or for() loops.  
You have to use a recursive template -- a template that calls itself, just 
like a function that calls itself in Javascript.  The whole thing looks like 
this (for brevity, the 'xsl:' prefix has been removed from each element):

<template name="replace-string">
  <!-- search for this: -->
  <param name="search" select="string(.)"/>

  <!-- and replace it with this: -->
  <param name="replace" select="string(.)"/>

  <!-- here is the original string: -->
  <param name="string" select="string(.)"/>
  
  <choose>
    <when test="not(contains($string, $search))">
      <!-- if there are no more appearances of $search in the
        $string, output the rest of the string and stop. -->
      <value-of select="$string"/>
    </when>
    <otherwise>
      <!-- output the part of the $string that is before the
        first appearance of $search. -->
      <value-of select="substring-before($string, $search)"/>
      
      <!-- output the replacement $replace.  -->
      <value-of select="$replace"/>

      <!-- repeat the process, using the part of $string that
        comes after the first appearance of $search. -->
      <call-template name="replace-string">
        <with-param name="search" select="$search"/>
        <with-param name="replace" select="$replace"/>
        <with-param name="string" select="substring-after($string, $search)"/>
      </call-template>
    </otherwise>
  </choose>
</template>
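
For readers more at home in Java, the same recursion can be sketched as 
below. This is only an illustration of the steps described above, not 
part of the stylesheet; the class and method names are made up. One 
assumption is added: an empty search string is returned unchanged, 
because in XPath contains($string, '') is true, so the template as 
written would recurse forever on an empty $search.

```java
/** Hypothetical Java rendering of the recursive replace-string template. */
final class ReplaceStringSketch {

    static String replace(String string, String search, String replacement) {
        int at = string.indexOf(search);
        // No (more) occurrences of the search string: output the rest and stop.
        if (search.isEmpty() || at < 0) {
            return string;
        }
        // Equivalent of substring-before($string, $search).
        String before = string.substring(0, at);
        // Equivalent of substring-after($string, $search).
        String after = string.substring(at + search.length());
        // Output the part before, then the replacement, then repeat on the rest.
        return before + replacement + replace(after, search, replacement);
    }
}
```

Calling ReplaceStringSketch.replace("Hello. Goodbye.", ".", "!") walks 
through the same steps as the "Hello. Goodbye." example above.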


The one limitation is that the above template can only search-and-replace one 
string at a time, but in order to escape Javascript, you have to do it for 
each of the six characters that are illegal (single-quote, backslash, 
double-quote, line-feed, carriage-return, and tab).  I've done this before, 
and here is what I came up with:

  <template name="escape-javascript">
    <param name="string" select="string(.)"/>

    <choose>
      <when test="function-available('text-utils:escape')">
        <value-of select="text-utils:escape($string)"/>
      </when>

      <otherwise>
        <!-- replace all characters not matching SingleStringCharacter
        or DoubleStringCharacter according to ECMA262.  Note: not all
        characters that should be escaped are legal XML characters:
        "\a", "\b", "\v", and "\f" are not escaped. -->
        <call-template name="replace-string">
          <with-param name="search">'</with-param>
          <with-param name="replace">\'</with-param>
          <with-param name="string">
            <call-template name="replace-string">
              <with-param name="search">"</with-param>
              <with-param name="replace">\"</with-param>
              <with-param name="string">
                <call-template name="replace-string">
                  <with-param name="search">
                    <text>&#9;</text>
                  </with-param>
                  <with-param name="replace">\t</with-param>
                  <with-param name="string">
                    <call-template name="replace-string">
                      <with-param name="search">
                        <text>&#10;</text>
                      </with-param>
                      <with-param name="replace">\n</with-param>
                      <with-param name="string">
                        <call-template name="replace-string">
                          <with-param name="search">
                            <text>&#13;</text>
                          </with-param>
                          <with-param name="replace">\r</with-param>
                          <with-param name="string">
                            <call-template name="replace-string">
                              <!-- remember to do backslash first -->
                              <with-param name="search">\</with-param>
                              <with-param name="replace">\\</with-param>
                              <with-param name="string" select="$string">
                              </with-param>
                            </call-template>
                          </with-param>
                        </call-template>
                      </with-param>
                    </call-template>
                  </with-param>
                </call-template>
              </with-param>
            </call-template>
          </with-param>
        </call-template>
      </otherwise>
    </choose>
  </template>

Let me know if that needs clarification.

To use the above template, you just call it like this:

<xsl:call-template name="escape-javascript">
  <xsl:with-param name="string">The 'string' to escape.</xsl:with-param>
</xsl:call-template>
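
The escape-javascript template only falls back to the nested 
replace-string calls when no text-utils:escape extension function is 
available. As a rough idea of what such a helper could do, here is a 
minimal Java sketch. EscapeJavascriptSketch is a made-up name, it is not 
the actual text-utils code, and wiring it into Xalan as an extension 
function is not shown. Because it looks at each character exactly once, 
the "do backslash first" ordering concern does not arise.

```java
/** Hypothetical single-pass escaper for the six characters listed above. */
final class EscapeJavascriptSketch {

    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break; // backslash
                case '\'': out.append("\\'");  break; // single quote
                case '"':  out.append("\\\""); break; // double quote
                case '\n': out.append("\\n");  break; // line-feed
                case '\r': out.append("\\r");  break; // carriage-return
                case '\t': out.append("\\t");  break; // tab
                default:   out.append(c);      break; // everything else is copied as is
            }
        }
        return out.toString();
    }
}
```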

Hope that helps!

-- 
Peter Davis


XSLT function problem

Posted by rabii mouali <mo...@andil.fr>.
Hi,

Perhaps this is not the place to ask this kind of question, but I didn't
find the answer anywhere.

I use xalan to transform an xml file to html using xsl.

The output contains some javascript. As my xml content is in French, there
is a conflict between javascript and &apos;, which must become \&apos;.

I tried to use the translate function in my xsl, but there is a problem: I
get an error saying that there is a single quote that is not closed.

How can I replace an &apos; in the input with \&apos; in the output?

Thanks