Posted to commons-dev@ws.apache.org by "Grzegorz Grzybek (JIRA)" <ji...@apache.org> on 2007/10/23 09:04:57 UTC

[jira] Reopened: (WSCOMMONS-260) Invalid XML after serialization

     [ https://issues.apache.org/jira/browse/WSCOMMONS-260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grzegorz Grzybek reopened WSCOMMONS-260:
----------------------------------------


Hello!

Well, it's almost OK, but there is one more special case: XML documents WITHOUT an XML declaration are assumed to be encoded in "UTF-8". So there should be no:

   write(new OutputStreamWriter(out))

in the else clause, when this.inputEncoding is null or empty.

I've added a simple XML file with UTF-8 characters and NO XML declaration to the other_encoding directory (not in Apache's SVN, of course :) and the test fails: the XML produced contains Windows-1250 characters on my Polish Windows OS.
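
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not taken from XmlSchema: the class name EncodingMismatchDemo, its helper methods and the sample string are made up for illustration, and ISO-8859-1 stands in for a non-UTF-8 platform default such as Windows-1250 only because every JVM is guaranteed to support it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of ws-commons-XmlSchema.
class EncodingMismatchDemo {

    // Serializes text through an OutputStreamWriter that uses the given charset.
    static byte[] serialize(String text, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        }
        return out.toByteArray();
    }

    // Returns true if the bytes form a well-formed UTF-8 sequence,
    // which is what a parser expects when there is no XML declaration.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // A document without an XML declaration containing U+00F3 (o with acute).
        String xml = "<a>\u00f3</a>";

        // ISO-8859-1 here plays the role of the platform default encoding.
        byte[] legacy = serialize(xml, StandardCharsets.ISO_8859_1);
        byte[] utf8 = serialize(xml, StandardCharsets.UTF_8);

        System.out.println(isValidUtf8(legacy)); // prints false
        System.out.println(isValidUtf8(utf8));   // prints true
    }
}
```

The single byte 0xF3 written by the legacy charset is not a complete UTF-8 sequence, so a consumer that assumes UTF-8 rejects the output, while the explicitly UTF-8 writer produces bytes it can read back.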

I think the write methods should look like this:

   if (this.inputEncoding == null || "".equals(this.inputEncoding))
      this.inputEncoding = "utf-8";
   try {
      write(new OutputStreamWriter(out, this.inputEncoding), options);
   } catch (UnsupportedEncodingException e) {
      // log the error and just write it without the encoding
      // TO CHECK: maybe use "utf-8"?
      write(new OutputStreamWriter(out));
   }

but setting this.inputEncoding = "utf-8" may not be such a good idea - I don't know whether it would cause problems in other parts of ws-commons-XmlSchema.
Anyway, assuming "UTF-8" encoding is in accordance with the XML Specification.
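
For reference, a standalone variant of this idea could be sketched as below. This is only a sketch under assumptions, not the actual XmlSchema code: the class Utf8FallbackSketch and the methods effectiveEncoding and write are hypothetical names, and the content is passed in as a plain String instead of a serialized schema. It differs from the snippet above in two ways: it does not modify this.inputEncoding (it computes the encoding locally), and its catch branch falls back to UTF-8 instead of the platform default, which is the alternative raised in the TO CHECK comment.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UnsupportedEncodingException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not part of ws-commons-XmlSchema.
class Utf8FallbackSketch {

    // Picks UTF-8 when no input encoding is known, without touching any field.
    static String effectiveEncoding(String inputEncoding) {
        if (inputEncoding == null || inputEncoding.isEmpty()) {
            return "UTF-8";
        }
        return inputEncoding;
    }

    // Hypothetical stand-in for a write(OutputStream) method: the content
    // parameter replaces the schema that the real method would serialize.
    static void write(String content, OutputStream out, String inputEncoding)
            throws IOException {
        Writer writer;
        try {
            writer = new OutputStreamWriter(out, effectiveEncoding(inputEncoding));
        } catch (UnsupportedEncodingException e) {
            // Unknown charset name: use UTF-8 rather than the platform default.
            writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        }
        writer.write(content);
        writer.flush(); // the caller owns the stream, so flush without closing
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write("<a>\u00f3</a>", out, null); // no encoding known
        System.out.println(out.size());   // prints 9 (U+00F3 takes two bytes in UTF-8)
    }
}
```

Computing the encoding in a local helper sidesteps the question of whether changing the stored inputEncoding affects other parts of the library.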

with best regards
Grzegorz Grzybek


> Invalid XML after serialization
> -------------------------------
>
>                 Key: WSCOMMONS-260
>                 URL: https://issues.apache.org/jira/browse/WSCOMMONS-260
>             Project: WS-Commons
>          Issue Type: Bug
>          Components: XmlSchema
>         Environment: any
>            Reporter: Grzegorz Grzybek
>            Assignee: Ajith Harshana Ranabahu
>            Priority: Critical
>
> The org.apache.ws.commons.schema.XmlSchema.write() methods use the wrong version of the OutputStreamWriter constructor. When the schema's XML document contains characters outside the standard 'us-ascii' charset, OutputStreamWriter assumes the default platform encoding (e.g. Windows-1250 on Windows/Poland). Thus, when the document is UTF-8, it is serialized in the Windows-1250 encoding.
> The immediate result is an error during "pretty printing" of the serialized document.
> Please use the OutputStreamWriter(OutputStream os, String encoding) constructor!!!
> with best regards
> Grzegorz Grzybek

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

