Posted to j-dev@xerces.apache.org by ji...@apache.org on 2004/04/12 23:15:59 UTC

[jira] Updated: (XERCERJ-119) DocumentBuilder parse() produces a mangled Document.

The following issue has been updated:

    Updater: Serge Knystautas (mailto:sergek@lokitech.com)
       Date: Mon, 12 Apr 2004 2:14 PM
    Changes:
             Attachment changed from dtm.xml
    ---------------------------------------------------------------------
For a full history of the issue, see:

  http://issues.apache.org/jira/browse/XERCERJ-119?page=history

---------------------------------------------------------------------
View the issue:
  http://issues.apache.org/jira/browse/XERCERJ-119

Here is an overview of the issue:
---------------------------------------------------------------------
        Key: XERCERJ-119
    Summary: DocumentBuilder parse() produces a mangled Document.
       Type: Bug

     Status: Resolved
 Resolution: CANNOT REPRODUCE

    Project: Xerces2-J

   Assignee: Xerces-J Developers Mailing List
   Reporter: Donald Leslie

    Created: Wed, 16 Jan 2002 9:20 AM
    Updated: Mon, 12 Apr 2004 2:14 PM
Environment: Operating System: Other
Platform: PC

Description:
1. Instantiate a DocumentBuilder (docBuilder).
2. Call docBuilder.parse() with a String URI.
3. Serialize the Document returned by the parse operation.

In some of the resulting blocks of text (i.e., from individual text nodes), the 
final characters in the text block appear at the beginning of the text block.
For example, the following text in the Xalan dtm.xml document

 <p>The Document Table Model (DTM) is an interface to a Document Model designed 
specifically for the needs of our XPath and XSLT implementations. The motivation 
behind this model is to optimize performance and minimize storage.</p>

appears as follows when the document is parsed and the resulting Document 
serialized:

<p>implementations. The motivation behind this model is to optimize performance 
and minimize storage.The Document Table Model (DTM) is an interface to a 
Document Model designed specifically for the needs of our XPath and XSLT </p>

Note: This bug appeared when we tried to use StyleBook (updated to call JAXP 
rather than Xerces directly) with Xerces-J2beta4 to produce the Xalan-J 
documentation. 

Here is a simplified example named Test. Check out xml-xalan/java, and run Test 
from xml-xalan/java/test.

import java.io.FileOutputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.apache.xalan.serialize.Serializer;
import org.apache.xalan.serialize.SerializerFactory;
import org.apache.xalan.templates.OutputProperties;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class Test
{
  public static void main(String[] argv)
    throws IOException, SAXException, ParserConfigurationException
  {
    // 1. Instantiate a namespace-aware DocumentBuilder through JAXP.
    DocumentBuilderFactory dFactory = DocumentBuilderFactory.newInstance();
    dFactory.setNamespaceAware(true);

    DocumentBuilder dBuilder = dFactory.newDocumentBuilder();

    // 2. Parse the Xalan dtm.xml source (path is relative to xml-xalan/java/test).
    Document doc = dBuilder.parse("..\\xdocs\\sources\\xalan\\dtm.xml");

    // 3. Serialize the parsed Document as XML to dtm_out.xml.
    Serializer serializer = SerializerFactory.getSerializer
      (OutputProperties.getDefaultMethodProperties("xml"));
    FileOutputStream out = new FileOutputStream("dtm_out.xml");
    try
    {
      serializer.setOutputStream(out);
      serializer.asDOMSerializer().serialize(doc);
    }
    finally
    {
      // Close the stream so the serialized output is fully written.
      out.close();
    }
  }
}
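
For anyone who wants to check the symptom without the Xalan serializer classes, here is a rough sketch of the same three steps using only JAXP: parse, serialize through an identity Transformer, and test whether the paragraph text comes back in its original order. The class name RoundTripCheck and the roundTrip helper are made up for illustration and are not part of the original report. The input is a short in-memory string rather than dtm.xml, so it may well not trigger the problem; substituting the real file would be the closer test.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not from the original report.
public class RoundTripCheck
{
  // Parses the XML string into a DOM Document and serializes it back to a String.
  public static String roundTrip(String xml) throws Exception
  {
    DocumentBuilderFactory dFactory = DocumentBuilderFactory.newInstance();
    dFactory.setNamespaceAware(true);
    DocumentBuilder dBuilder = dFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(new InputSource(new StringReader(xml)));

    // A Transformer created without a stylesheet copies its input unchanged.
    Transformer identity = TransformerFactory.newInstance().newTransformer();
    identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    StringWriter out = new StringWriter();
    identity.transform(new DOMSource(doc), new StreamResult(out));
    return out.toString();
  }

  public static void main(String[] argv) throws Exception
  {
    // Paragraph text from dtm.xml as quoted in the report, joined onto one line.
    String text = "The Document Table Model (DTM) is an interface to a Document Model "
      + "designed specifically for the needs of our XPath and XSLT implementations. "
      + "The motivation behind this model is to optimize performance and minimize storage.";

    String serialized = roundTrip("<p>" + text + "</p>");

    // If trailing characters had moved to the front, the original run would be absent.
    System.out.println(serialized.contains(text)
      ? "text order preserved" : "text order mangled");
  }
}
```

The check relies on the paragraph containing no characters that the serializer would escape, so a plain substring match is enough here.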


---------------------------------------------------------------------
JIRA INFORMATION:
This message is automatically generated by JIRA.

If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa

If you want more information on JIRA, or have a bug to report see:
   http://www.atlassian.com/software/jira
