Posted to dev@xalan.apache.org by "Michael Glavassevich (JIRA)" <xa...@xml.apache.org> on 2006/11/15 07:47:37 UTC

[jira] Created: (XALANJ-2337) [PATCH]: DOM Level 3 Serializer Bug Fixes and Improvements

[PATCH]: DOM Level 3 Serializer Bug Fixes and Improvements
----------------------------------------------------------

                 Key: XALANJ-2337
                 URL: http://issues.apache.org/jira/browse/XALANJ-2337
             Project: XalanJ2
          Issue Type: Bug
          Components: DOM, Serialization
    Affects Versions: Latest Development Code
            Reporter: Michael Glavassevich
         Attachments: dom3-ls-serializer-fixes.patch.txt

The attached patch addresses the following issues:

The DOM Level 3 Load and Save specification requires that implementations choose a default sequence [1] which matches one allowed by XML 1.0 (or XML 1.1). The current code blindly accepts whatever the value of "line.separator" is. If the value of "line.separator" isn't one of the XML 1.0 end-of-line sequences then we should select "\n" as the default value.
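
A minimal sketch of that selection rule, not the code from the attached patch. The class and method names are hypothetical, and it assumes the accepted set is the three sequences that XML 1.0 section 2.11 treats as line ends (LF, CR LF and CR):

```java
class DefaultNewLine {
    // Hypothetical helper; the name and structure are illustrative only.
    static String chooseDefault(String lineSeparator) {
        // Assumption: XML 1.0 section 2.11 recognizes LF, CR LF and CR.
        if ("\n".equals(lineSeparator)
                || "\r\n".equals(lineSeparator)
                || "\r".equals(lineSeparator)) {
            return lineSeparator;
        }
        // Anything else (including null) falls back to LF.
        return "\n";
    }

    public static void main(String[] args) {
        String chosen = chooseDefault(System.getProperty("line.separator"));
        System.out.println("default newline length: " + chosen.length());
    }
}
```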

Setting the newLine attribute [1] to null on an LSSerializer is supposed to restore the default value. The current code isn't doing that.
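
The expected behaviour can be checked through the standard org.w3c.dom.ls API. This is only a sketch: it runs against whichever LS implementation the DOMImplementationRegistry finds (not necessarily the Xalan one), and the class name is hypothetical.

```java
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

class NewLineReset {
    // Sets an explicit newLine, resets it with null, and returns what
    // the serializer then reports as its newLine value.
    static String newLineAfterReset() {
        try {
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            DOMImplementationLS ls =
                    (DOMImplementationLS) registry.getDOMImplementation("LS");
            LSSerializer serializer = ls.createLSSerializer();
            serializer.setNewLine("\r\n");
            // Per the LS specification, null should restore the default.
            serializer.setNewLine(null);
            return serializer.getNewLine();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String newLine = newLineAfterReset();
        System.out.println("newLine is null after reset: " + (newLine == null));
    }
}
```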

Support for the format-pretty-print feature is broken. Setting it to true with setParameter() has no effect. The wrong properties are being set internally. We need to set {http://www.w3.org/TR/DOM-Level-3-LS}format-pretty-print in order to enable the feature.
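
From the caller's side the feature is enabled through the serializer's DOMConfiguration. A small sketch using only the standard DOM and LS interfaces; the class name and the sample document are made up for illustration, and the exact indentation depends on the implementation in use:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

class PrettyPrintDemo {
    // Serializes <root><child/></root>, optionally with pretty printing.
    static String serialize(boolean pretty) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            root.appendChild(doc.createElement("child"));
            doc.appendChild(root);

            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            DOMImplementationLS ls =
                    (DOMImplementationLS) registry.getDOMImplementation("LS");
            LSSerializer serializer = ls.createLSSerializer();

            Boolean value = Boolean.valueOf(pretty);
            // Only set the parameter if the implementation accepts it.
            if (serializer.getDomConfig().canSetParameter("format-pretty-print", value)) {
                serializer.getDomConfig().setParameter("format-pretty-print", value);
            }
            return serializer.writeToString(doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(true));
    }
}
```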

Revision 474947 to DOM3TreeWalker fixed a bug where a char[] was being converted to a String using toString(). This probably crept in due to the repeated conversions of the end-of-line sequence (String -> char[] -> String -> char[]). I don't understand why we keep flipping back and forth between String and char[], but it's a waste to create the temp string every time in the TreeWalker. I moved the conversion into the setter on DOM3SerializerImpl.
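
For reference, the pitfall behind that earlier bug looks like this. The class is a standalone illustration, not code from DOM3TreeWalker:

```java
class CharArrayToString {
    // Object.toString() on an array yields a type-and-hash string
    // such as "[C@1b6d3586", not the characters it holds.
    static String wrong(char[] chars) {
        return chars.toString();
    }

    // new String(char[]) copies the characters into a String.
    static String right(char[] chars) {
        return new String(chars);
    }

    public static void main(String[] args) {
        char[] eol = {'\r', '\n'};
        System.out.println(wrong(eol).startsWith("[C@"));
        System.out.println(right(eol).equals("\r\n"));
    }
}
```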

LSSerializerImpl is creating OutputStreamWriters directly. This bypasses the underlying serializer's encoding handling and prevents it from using its optimized writers for UTF-8 and ASCII. I've changed the code so that it just sets the OutputStream and lets the real serializer deal with the encoding issues.
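
On the calling side this corresponds to handing the serializer a byte stream plus an encoding name through LSOutput, rather than a pre-built Writer. A sketch with the standard interfaces; the class name and sample document are hypothetical:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

class ByteStreamOutputDemo {
    // Serializes <root/> to bytes and lets the serializer do the encoding.
    static String serializeToUtf8() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("root"));

            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            DOMImplementationLS ls =
                    (DOMImplementationLS) registry.getDOMImplementation("LS");
            LSSerializer serializer = ls.createLSSerializer();

            LSOutput output = ls.createLSOutput();
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            output.setByteStream(bytes);   // raw bytes, no Writer wrapper
            output.setEncoding("UTF-8");   // the serializer picks its own writer
            serializer.write(doc, output);

            return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializeToUtf8());
    }
}
```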

When writing to a file URI, LSSerializerImpl needs to decode the URI escape sequences before it creates the FileOutputStream. For a URI like "file:///D:/My%20Documents/file.xml" we should be writing to "D:\My Documents\file.xml", not "D:\My%20Documents\file.xml". The proposed change is based on code which has been in the Xerces DOM Level 3 serializer for over two years.
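
The difference between the escaped and decoded forms can be seen with java.net.URI. This is a stand-in for illustration only; the patch itself is based on the Xerces decoding code, which is not reproduced here, and the class name is hypothetical:

```java
import java.net.URI;

class FileUriDecode {
    // getPath() returns the path with %XX escapes decoded.
    static String decodedPath(String uri) {
        return URI.create(uri).getPath();
    }

    // getRawPath() returns the path exactly as written in the URI.
    static String rawPath(String uri) {
        return URI.create(uri).getRawPath();
    }

    public static void main(String[] args) {
        String uri = "file:///D:/My%20Documents/file.xml";
        System.out.println(rawPath(uri));
        System.out.println(decodedPath(uri));
    }
}
```

Only the decoded form names the directory that actually exists on disk; converting the leading "/D:/" into a platform path is a separate step not shown here.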

On Java 1.4 we should be using the exception chaining mechanism to capture the cause of the LSException. This was implemented in Xerces 2.8.0 and should be carried forward into the Xalan-based serializer.
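
A sketch of what chaining looks like with the standard LSException class, which has no constructor taking a cause, so Throwable.initCause() is used. The helper name is hypothetical and this is not the code from Xerces or from the patch:

```java
import java.io.IOException;
import org.w3c.dom.ls.LSException;

class ChainedLSException {
    // Wraps a lower-level failure in an LSException and records it as the cause.
    static LSException wrap(Exception cause) {
        LSException e = new LSException(LSException.SERIALIZE_ERR, cause.getMessage());
        e.initCause(cause);
        return e;
    }

    public static void main(String[] args) {
        LSException e = wrap(new IOException("disk full"));
        System.out.println(e.getCause() instanceof IOException);
    }
}
```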

There are several printStackTrace() calls scattered around LSSerializerImpl. We should never be making these. I've removed all of them.

[1] http://www.w3.org/TR/2004/REC-DOM-Level-3-LS-20040407/load-save.html#LS-LSSerializer-newLine

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: xalan-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xalan-dev-help@xml.apache.org


[jira] Resolved: (XALANJ-2337) [PATCH]: DOM Level 3 Serializer Bug Fixes and Improvements

Posted by "Brian Minchau (JIRA)" <xa...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XALANJ-2337?page=all ]

Brian Minchau resolved XALANJ-2337.
-----------------------------------

    Fix Version/s: Latest Development Code
       Resolution: Fixed

I have reviewed Michael's patch and I approve it. I have committed it to the latest development code.




[jira] Commented: (XALANJ-2337) [PATCH]: DOM Level 3 Serializer Bug Fixes and Improvements

Posted by "Michael Glavassevich (JIRA)" <xa...@xml.apache.org>.
    [ http://issues.apache.org/jira/browse/XALANJ-2337?page=comments#action_12450456 ] 
            
Michael Glavassevich commented on XALANJ-2337:
----------------------------------------------

Thanks Brian.  I see all the fixes are in.  You can close this JIRA issue.




[jira] Closed: (XALANJ-2337) [PATCH]: DOM Level 3 Serializer Bug Fixes and Improvements

Posted by "Brian Minchau (JIRA)" <xa...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XALANJ-2337?page=all ]

Brian Minchau closed XALANJ-2337.
---------------------------------


Closing this issue per Michael's request.

