Posted to issues@maven.apache.org by "Naoki Nose (JIRA)" <ji...@codehaus.org> on 2006/04/23 10:23:19 UTC

[jira] Commented: (MNG-2148) MXParser can't handle the encoding declaration in XML declaration

    [ http://jira.codehaus.org/browse/MNG-2148?page=comments#action_64002 ] 

Naoki Nose commented on MNG-2148:
---------------------------------

I have attached two additional files, because my first attachment wasn't in diff format.

The file plexus-utils.diff is the patch for plexus-utils. It contains the XML encoding detection code.
The file plexus-utils-test-resources.tar.gz contains the test resource files for MXParserTest.java.
These resource files are binary, so plexus-utils.diff couldn't contain them.



> MXParser can't handle the encoding declaration in XML declaration 
> ------------------------------------------------------------------
>
>          Key: MNG-2148
>          URL: http://jira.codehaus.org/browse/MNG-2148
>      Project: Maven 2
>         Type: Bug

>   Components: POM
>     Reporter: Naoki Nose
>  Attachments: plexus-utils-test-resource.tar.gz, plexus-utils.diff, src.jar
>
>
> The XML pull parser in plexus-utils (MXParser.java) can't handle the encoding declaration in the XML declaration.
> So it's impossible to use an encoding different from the system default encoding. This is critical in Japan, because
> there are two commonly used encodings in Japanese environments (SJIS and EUC-JP).
> I think MXParser should handle the encoding declaration in XML as described in the W3C specification:
> http://www.w3.org/TR/REC-xml/#sec-guessing
> I tried to fix this problem (see attachment).
> I changed the setInput(InputStream) method to detect the encoding in the XML declaration.
> In writing this code, I referred to the source code of Apache Xerces.
> UCS-4 and UCS-2 aren't supported in this implementation, because
> these encodings aren't supported by the Sun JDK.
> Xerces solves this problem by providing its own readers for these encodings. I think Xerces's solution is
> too complex for plexus-utils.
> To solve this issue, it's not sufficient to change only plexus-utils, because
> DefaultMavenProjectBuilder reads the POM with a FileReader without specifying an encoding.
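
For readers without access to the attachment, the kind of detection described above might look roughly like the following. This is only a minimal sketch and not the attached patch: the class name XmlEncodingSniffer, its methods, and the 256-byte lookahead are all hypothetical, and it covers only a byte order mark and ASCII-compatible encoding declarations. Like the patch described above, it does not handle UCS-4, and it also ignores UTF-16 input that has no byte order mark.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of plexus-utils or the attached patch.
final class XmlEncodingSniffer {
    // Assumption: the XML declaration fits in the first 256 bytes.
    private static final int LOOKAHEAD = 256;

    // Matches encoding="..." inside a leading <?xml ... ?> declaration.
    private static final Pattern ENCODING = Pattern.compile(
        "^<\\?xml[^>]*?\\sencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    private static boolean startsWith(byte[] b, int len, int... prefix) {
        if (len < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((b[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    /** Guesses the charset name from the first len bytes of a document. */
    public static String detect(byte[] b, int len) {
        if (startsWith(b, len, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        // Java's "UTF-16" decoder reads the byte order mark itself.
        if (startsWith(b, len, 0xFE, 0xFF) || startsWith(b, len, 0xFF, 0xFE)) return "UTF-16";
        // No byte order mark: read the declaration as single-byte text.
        String head = new String(b, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(head);
        // Without a declaration, the XML specification implies UTF-8.
        return m.find() ? m.group(1) : "UTF-8";
    }

    /** Wraps the stream in a Reader that uses the detected encoding. */
    public static Reader newReader(InputStream in) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, LOOKAHEAD);
        byte[] buf = new byte[LOOKAHEAD];
        int len = 0;
        int n;
        while (len < LOOKAHEAD && (n = pin.read(buf, len, LOOKAHEAD - len)) > 0) {
            len += n;
        }
        String encoding = detect(buf, len);
        // Java's UTF-8 decoder does not strip the byte order mark, so skip it here.
        int skip = startsWith(buf, len, 0xEF, 0xBB, 0xBF) ? 3 : 0;
        pin.unread(buf, skip, len - skip);
        // Throws UnsupportedEncodingException if the JDK lacks the charset.
        return new InputStreamReader(pin, encoding);
    }
}
```

Under these assumptions, a caller that currently builds a FileReader for the POM could instead pass a FileInputStream to XmlEncodingSniffer.newReader(...), so that the declared encoding rather than the system default decides how the bytes are decoded.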

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira