You are viewing a plain text version of this content. The canonical link for it is here.
Posted to j-dev@xerces.apache.org by "Stephen White (JIRA)" <xe...@xml.apache.org> on 2007/05/04 14:12:15 UTC

[jira] Created: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
-----------------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: XERCESJ-1248
                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
             Project: Xerces2-J
          Issue Type: Bug
          Components: DOM (Level 2 Events)
    Affects Versions: 2.9.0
         Environment: Applies to all environments
            Reporter: Stephen White
             Fix For: 2.9.0


AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.

In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.

Threading:

This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.

Mutations of independant documents from within a mutation listener:

This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.

Memory leak:

If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.


I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.

---- BEGIN SAMPLE CODE ---
package com.decisionsoft.xercesbug;

import java.io.ByteArrayInputStream;

import javax.xml.parsers.DocumentBuilder;

import org.apache.xerces.dom.AttrImpl;
import org.apache.xerces.dom.DocumentImpl;
import org.apache.xerces.dom.ElementImpl;
import org.apache.xerces.dom.events.MutationEventImpl;
import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.events.Event;
import org.w3c.dom.events.EventListener;

public class Bug {
  // Null EventListener.  Just so that we can register something.
  static class NullEventListener implements EventListener {
    public void handleEvent(Event evt) {
      System.err.println("[ NullEventListener: ignoring event ]");
    }
  }

  // EventListener that mutates an unrelated document when an event is received.
  static class MyEventListener implements EventListener {
    private Document _doc;

    public MyEventListener(Document doc) {
      _doc = doc;
    }

    public void handleEvent(Event evt) {
      System.err.println("MyEventListener: start *******");
      displayEvent(evt);
      System.err.println("MyEventListener: changing unrelated document");
      Element ele = (Element) _doc.getDocumentElement().getFirstChild();
      ele.setAttribute("xml2attr", "NewValueSetInEventListener");
      System.err.println("MyEventListener: displaying original event again");
      displayEvent(evt);
      System.err.println("MyEventListener: end *******");
    }

  }

  // Utility method to display an event
  public static void displayEvent(Event evt) {
    System.err.println("  Event: " + evt + ", on " + evt.getTarget());
    if (evt instanceof MutationEventImpl) {
      MutationEventImpl mutation = (MutationEventImpl) evt;
      System.err.println("  + Related: " + mutation.getRelatedNode());
    }
  }

  /**
   * @param args
   */
  public static void main(String[] args) throws Exception {
    String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
    String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";

    DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
    DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
    DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));

    ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
    AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
    attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);

    doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);

    Element e = (Element) doc1.getDocumentElement().getFirstChild();
    e.setAttribute("xml1attr", "xml1AttrNewValue");
  }
}
---- END SAMPLE CODE ----

Running this with Xerces 2.9.0 gives:
MyEventListener: start *******
  Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
  + Related: xml1attr="xml1AttrValue"
MyEventListener: changing unrelated document
[ NullEventListener: ignoring event ]
MyEventListener: displaying original event again
  Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
  + Related: xml1attr="xml2AttrValue"
MyEventListener: end *******

Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).

The corrent output would be:
MyEventListener: start *******
  Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
  + Related: xml1attr="xml1AttrValue"
MyEventListener: changing unrelated document
[ NullEventListener: ignoring event ]
MyEventListener: displaying original event again
  Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
  + Related: xml1attr="xml1AttrValue"
MyEventListener: end *******

Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.

The patch I created to correct this issue is:
diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
--- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
+++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
@@ -138,8 +138,6 @@
     // REVISIT: we are losing the type information in DOM during serialization
     transient Object type;

-    protected static TextImpl textNode = null;
-
     //
     // Constructors
     //
@@ -361,13 +359,8 @@
                     oldvalue = (String) value;
                     // create an actual text node as our child so
                     // that we can use it in the event
-                    if (textNode == null) {
-                        textNode = (TextImpl)
+                    TextImpl textNode = (TextImpl)
                             ownerDocument.createTextNode((String) value);
-                    }
-                    else {
-                        textNode.data = (String) value;
-                    }
                     value = textNode;
                     textNode.isFirstChild(true);
                     textNode.previousSibling = textNode;

I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Updated: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Glavassevich updated XERCESJ-1248:
------------------------------------------

    Fix Version/s:     (was: 2.9.0)

Removing the "Fix Version". This bug hasn't been fixed yet.

> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Commented: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890331#action_12890331 ] 

Michael Glavassevich commented on XERCESJ-1248:
-----------------------------------------------

Xerces-J 2.6.2 is over 6 years old.  There have been hundreds of improvements / fixes since then.  I would suggest that you try the most recent release, but keep in mind that a "leak" involving that many DOM objects is likely due to a problem in the application using the DOM, either creating a DOM large programatically or parsing a huge XML document and not making that DOM available for garbage collection when it's done with it.

> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>            Assignee: Michael Glavassevich
>             Fix For: 2.10.0
>
>         Attachments: staticTextNodePatch.txt, xerces_2.6.2.doc
>
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Updated: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "nareshreddy (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nareshreddy updated XERCESJ-1248:
---------------------------------

    Attachment: xerces_2.6.2.doc

Attached is memory leak suspect.

> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>            Assignee: Michael Glavassevich
>             Fix For: 2.10.0
>
>         Attachments: staticTextNodePatch.txt, xerces_2.6.2.doc
>
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Resolved: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Glavassevich resolved XERCESJ-1248.
-------------------------------------------

       Resolution: Fixed
    Fix Version/s: 2.10.0

Ludger, thanks for the patch. I reviewed it yesterday evening. It looked good and I just committed it to SVN (see rev 924245) with one minor change: setting the data field directly on the text node instead of calling setNodeValueInternal().

> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>            Assignee: Michael Glavassevich
>             Fix For: 2.10.0
>
>         Attachments: staticTextNodePatch.txt
>
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Commented: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "Randall Theobald (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845924#action_12845924 ] 

Randall Theobald commented on XERCESJ-1248:
-------------------------------------------

Again, is there any plan to fix this issue? Let me provide some details about how this is impacting our product.

After an operation has completed with a specific workload (other workloads are also affected), the static 'textNode' field is still populated which ultimately is keeping 82 MB of objects alive in the heap, which is 32% of our total live heap. This is unacceptable. I measured this by clearing out this field using reflection once after the operation was complete, which is obviously a bad hack. However, given the large impact, we may incorporate such a hack into our product.



> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Commented: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "Ludger Bünger (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845985#action_12845985 ] 

Ludger Bünger commented on XERCESJ-1248:
----------------------------------------

Hi Stephen,

I am not sure whether you are refering to my comment since I am not proposing to introduce a new field to AttrImpl...
Therefore I am not sure what your comments about expensive RAM use are referring to.

The difference between your and my suggestion is that in addition I suggest to reuse the (method local!) text node you create later on within the same method where a text node would have to be created anyhow.
And thus that would impose no performance penalty of one additional text node created compared to the current "static text node" while getting rid of the static text node.

> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>         Attachments: staticTextNodePatch.txt
>
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Assigned: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Glavassevich reassigned XERCESJ-1248:
---------------------------------------------

    Assignee: Michael Glavassevich

> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>            Assignee: Michael Glavassevich
>         Attachments: staticTextNodePatch.txt
>
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Issue Comment Edited: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "Michael Glavassevich (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890331#action_12890331 ] 

Michael Glavassevich edited comment on XERCESJ-1248 at 7/20/10 1:34 PM:
------------------------------------------------------------------------

Xerces-J 2.6.2 is over 6 years old.  There have been hundreds of improvements / fixes since then.  I would suggest that you try the most recent release, but keep in mind that a "leak" involving that many DOM objects is likely due to a problem in the application using the DOM, either creating a large DOM programatically or parsing a huge XML document and not making that DOM available for garbage collection when it's done with it.

      was (Author: mrglavas@ca.ibm.com):
    Xerces-J 2.6.2 is over 6 years old.  There have been hundreds of improvements / fixes since then.  I would suggest that you try the most recent release, but keep in mind that a "leak" involving that many DOM objects is likely due to a problem in the application using the DOM, either creating a DOM large programatically or parsing a huge XML document and not making that DOM available for garbage collection when it's done with it.
  
> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>            Assignee: Michael Glavassevich
>             Fix For: 2.10.0
>
>         Attachments: staticTextNodePatch.txt, xerces_2.6.2.doc
>
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Commented: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "nareshreddy (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890319#action_12890319 ] 

nareshreddy commented on XERCESJ-1248:
--------------------------------------

Hi,

We use xerces_2.6.2 version and observed around 500 MB leak for org.apache.xerces.dom.ElementImpl class. Could you let u sknow if this issue effects even for  xerces_2.6.2 version.

I have attached xerces_2.6.2.doc which shows as memory leak.

~Naresh

> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>            Assignee: Michael Glavassevich
>             Fix For: 2.10.0
>
>         Attachments: staticTextNodePatch.txt
>
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Commented: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "Randall Theobald (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513585 ] 

Randall Theobald commented on XERCESJ-1248:
-------------------------------------------

We are seeing negative memory side-effects that we've tracked to this issue in (at least) one of our products.


> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Updated: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "Ludger Bünger (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ludger Bünger updated XERCESJ-1248:
-----------------------------------

    Attachment: staticTextNodePatch.txt

Attached find a small patch that would fix the issue.

The static node is used to reduce the number of text node creations (if mutation events are present).

However AttrImpl.setValue(String) does always create a text node (if mutation events are present) later on in to notify event listeners about node insertions anyhow.

So what we can do is create only one instance of a text node and use it twice for node removal notification and node insertion notification instead of reusing a static referenced text node.

Result: same number of text nodes created while ommiting a static reference (which is always undesirable).

In addition I removed two meanwhile obsolete methods one of them modified the static textNode field too.

> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>         Attachments: staticTextNodePatch.txt
>
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Commented: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "Stephen White (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845990#action_12845990 ] 

Stephen White commented on XERCESJ-1248:
----------------------------------------

Sorry, yes, I misread your patch.  Ignore my previous comment, your change looks fine to me.

> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>         Attachments: staticTextNodePatch.txt
>
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org


[jira] Commented: (XERCESJ-1248) Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks

Posted by "Stephen White (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESJ-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845973#action_12845973 ] 

Stephen White commented on XERCESJ-1248:
----------------------------------------

Using a non-static textNode increases the memory use of all AttrImpls (one per attribute in the document) by 4 bytes.  In large documents or those with a large number of attributes this may be an unacceptable overhead to gain a small performance advantage in certain (pretty rare) circumstances.  Many people will use DOM without ever adding mutation listeners, and we don't want a fix that negatively impacts performance for them.  DOM is already expensive in terms of RAM use.

Also, using a non-static textNode eliminates the main performance gain that the static version gave ... that is when firing mutations events for a large number of elements only one single TextNode needs constructing in total, rather than one (or more) per attribute object affected.

These issues are why my patch simply erradicated the textNode altogether.  On modern JVMs the allocation/deallocation of the TextNode really shouldn't be prohibitively expensive in real-world situations.

> Static textNode on AttrImpl breaks thread-safety, mutations of independant documents from within a mutation listener, and introduces potential memory leaks
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1248
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1248
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 2 Events)
>    Affects Versions: 2.9.0
>         Environment: Applies to all environments
>            Reporter: Stephen White
>         Attachments: staticTextNodePatch.txt
>
>
> AttrImpl has two modes of operation, dependant on the hasStringValue() method.  This difference allows faster, lighter, operation in many cases .. as attribute values can be stored as Strings.  In some situations, notably DOM Events and the inclusion of entity references in Attribute values, the value must be stored 'correctly' as one or more DOM Nodes rather than as a String.
> In some Situations an AttrImpl may be operating in the 'String based' mode, but then switch to the 'Node based' mode.  This happens when a Mutation listener is registered on the Document and then a mutation event affecting the attribution in question is fired.  This requires the creation of a TextImpl Node to pass to the event listener.  This is all fine; however the author appears to have been concerned about the number of TextImpls that could potentially be constructed to pass to the event listener and then discarded immediately afterwards.  Unfortunately the author introduced a static TextImpl called textNode which is used, and re-used, to avoid the repeated creation of TextImpls.  This static field introduces several problems.
> Threading:
> This static field could potential be used by multiple independant nodes in independant documents.  If these documents are manipulated in seperate threads then this same static field could potentially be used concurrently by multiple threads and so introduce a problem.
> Mutations of independant documents from within a mutation listener:
> This is a subtle issue, but is 100% re-producable.  A mutation listener for one document can modify an independant document.  As the documents are totally independant this shouldn't cause a problem.  However, if both the mutation that triggered the event and the mutation that is initiated inside the event listener both utilise the static field on AttrImpl then the event listener will see its 'related node' change as an erroneous side-effect of the mutation to the independant document.
> Memory leak:
> If the static field is utilised during the mutation of a DOM document then that document will potentially be held in memory forever, though the 'owerNode' reference on the TextImpl object referenced from the static field.  As this static field is only used in certain specific circumstances it is possible for an application to continue processing new documents without this field being reset, and therefore the application will unwittingly continue to reference the old document.
> I have produced an example application which demonstrates that manipulating an independant document from within a mutation listener can have a side effect that modifies the original event object.
> ---- BEGIN SAMPLE CODE ---
> package com.decisionsoft.xercesbug;
> import java.io.ByteArrayInputStream;
> import javax.xml.parsers.DocumentBuilder;
> import org.apache.xerces.dom.AttrImpl;
> import org.apache.xerces.dom.DocumentImpl;
> import org.apache.xerces.dom.ElementImpl;
> import org.apache.xerces.dom.events.MutationEventImpl;
> import org.apache.xerces.jaxp.DocumentBuilderFactoryImpl;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.events.Event;
> import org.w3c.dom.events.EventListener;
> public class Bug {
>   // Null EventListener.  Just so that we can register something.
>   static class NullEventListener implements EventListener {
>     public void handleEvent(Event evt) {
>       System.err.println("[ NullEventListener: ignoring event ]");
>     }
>   }
>   // EventListener that mutates an unrelated document when an event is received.
>   static class MyEventListener implements EventListener {
>     private Document _doc;
>     public MyEventListener(Document doc) {
>       _doc = doc;
>     }
>     public void handleEvent(Event evt) {
>       System.err.println("MyEventListener: start *******");
>       displayEvent(evt);
>       System.err.println("MyEventListener: changing unrelated document");
>       Element ele = (Element) _doc.getDocumentElement().getFirstChild();
>       ele.setAttribute("xml2attr", "NewValueSetInEventListener");
>       System.err.println("MyEventListener: displaying original event again");
>       displayEvent(evt);
>       System.err.println("MyEventListener: end *******");
>     }
>   }
>   // Utility method to display an event
>   public static void displayEvent(Event evt) {
>     System.err.println("  Event: " + evt + ", on " + evt.getTarget());
>     if (evt instanceof MutationEventImpl) {
>       MutationEventImpl mutation = (MutationEventImpl) evt;
>       System.err.println("  + Related: " + mutation.getRelatedNode());
>     }
>   }
>   /**
>    * @param args
>    */
>   public static void main(String[] args) throws Exception {
>     String xml1 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><e xml1attr=\"xml1AttrValue\"/></root>\n";
>     String xml2 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root><f xml2attr=\"xml2AttrValue\"/></root>\n";
>     DocumentBuilder db = DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder();
>     DocumentImpl doc1 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml1.getBytes()));
>     DocumentImpl doc2 = (DocumentImpl) db.parse(new ByteArrayInputStream(xml2.getBytes()));
>     ElementImpl ele = (ElementImpl) doc1.getDocumentElement().getFirstChild();
>     AttrImpl attr = (AttrImpl) ele.getAttributeNode("xml1attr");
>     attr.addEventListener(MutationEventImpl.DOM_NODE_REMOVED, new MyEventListener(doc2), false);
>     doc2.addEventListener(MutationEventImpl.DOM_ATTR_MODIFIED, new NullEventListener(), true);
>     Element e = (Element) doc1.getDocumentElement().getFirstChild();
>     e.setAttribute("xml1attr", "xml1AttrNewValue");
>   }
> }
> ---- END SAMPLE CODE ----
> Running this with Xerces 2.9.0 gives:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml2AttrValue]
>   + Related: xml1attr="xml2AttrValue"
> MyEventListener: end *******
> Notice that the event has changed during handleEvent method.  At the end of the handleEvent method the 'relatedNode' on the event is wrong, it proclaims to relate to the attirubte 'xml1attr' with the value 'xml2AttrValue' (a value which that attribute never contains).
> The corrent output would be:
> MyEventListener: start *******
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: changing unrelated document
> [ NullEventListener: ignoring event ]
> MyEventListener: displaying original event again
>   Event: org.apache.xerces.dom.events.MutationEventImpl@17172ea, on [#text: xml1AttrValue]
>   + Related: xml1attr="xml1AttrValue"
> MyEventListener: end *******
> Here the mutation of the independant DOM document has not had the side effect of corrupting the event being handled by the mutation listener.
> The patch I created to correct this issue is:
> diff -Naur xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java
> --- xerces-2_9_0.orig/src/org/apache/xerces/dom/AttrImpl.java 2006-11-22 23:36:58.000000000 +0000
> +++ xerces-2_9_0/src/org/apache/xerces/dom/AttrImpl.java  2007-05-04 12:20:22.000000000 +0100
> @@ -138,8 +138,6 @@
>      // REVISIT: we are losing the type information in DOM during serialization
>      transient Object type;
> -    protected static TextImpl textNode = null;
> -
>      //
>      // Constructors
>      //
> @@ -361,13 +359,8 @@
>                      oldvalue = (String) value;
>                      // create an actual text node as our child so
>                      // that we can use it in the event
> -                    if (textNode == null) {
> -                        textNode = (TextImpl)
> +                    TextImpl textNode = (TextImpl)
>                              ownerDocument.createTextNode((String) value);
> -                    }
> -                    else {
> -                        textNode.data = (String) value;
> -                    }
>                      value = textNode;
>                      textNode.isFirstChild(true);
>                      textNode.previousSibling = textNode;
> I believe that this patch corrects the issue described in the bug.  The corrected code will instantiate a greater number of TextImpl objects in some situations, however I believe these situations rare and the potential (small) performance penalty is outweighed by the desire for the code to be correct.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-dev-help@xerces.apache.org