Posted to issues@maven.apache.org by "Naoki Nose (JIRA)" <ji...@codehaus.org> on 2006/03/12 17:28:29 UTC

[jira] Created: (MNG-2148) MXParser can't handle the encoding declaration in XML declaration

MXParser can't handle the encoding declaration in XML declaration 
------------------------------------------------------------------

         Key: MNG-2148
         URL: http://jira.codehaus.org/browse/MNG-2148
     Project: Maven 2
        Type: Bug

  Components: POM  
    Reporter: Naoki Nose
 Attachments: src.jar

The XML pull parser in plexus-utils (MXParser.java) can't handle the encoding declaration in the XML declaration.
As a result, it's impossible to use an encoding different from the system default encoding. This is critical in Japan, because
there are two commonly used encodings in Japanese environments (SJIS and EUC-JP).

I think MXParser should handle the encoding declaration in XML as described in the W3C specification:
http://www.w3.org/TR/REC-xml/#sec-guessing

I tried to fix this problem (see attachment).
I changed the setInput(InputStream) method to detect the encoding in the XML declaration.
In writing this code, I referred to the source code of Apache Xerces.
UCS-4 and UCS-2 aren't supported in this implementation, because
these encodings aren't supported by the Sun JDK.
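
For readers without the attachment, the general idea of detecting the encoding from the XML declaration can be sketched as follows. This is not the attached patch: the class name XmlEncodingSniffer, the 256-byte peek size, and the UTF-8 fallback are assumptions made for illustration. It only covers ASCII-compatible encodings plus BOM-marked UTF-8 and UTF-16, and like the patch it does not handle UCS-4 or UCS-2.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, for illustration only; not part of plexus-utils. */
public class XmlEncodingSniffer {
    // Assumed upper bound on how many bytes the XML declaration occupies.
    private static final int PEEK = 256;

    // Matches encoding="..." or encoding='...' inside a leading <?xml ... ?>.
    private static final Pattern ENCODING = Pattern.compile(
        "^<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    private static boolean hasUtf8Bom(byte[] b, int len) {
        return len >= 3 && (b[0] & 0xFF) == 0xEF
            && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF;
    }

    /** Guesses the encoding name from the first bytes of a document. */
    public static String detect(byte[] head, int len) {
        if (hasUtf8Bom(head, len)) {
            return "UTF-8";
        }
        if (len >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            if ((b0 == 0xFE && b1 == 0xFF) || (b0 == 0xFF && b1 == 0xFE)) {
                // Java's "UTF-16" decoder reads the BOM to pick the byte order.
                return "UTF-16";
            }
        }
        // The declaration itself is ASCII in ASCII-compatible encodings such as
        // Shift_JIS and EUC-JP, so a byte-preserving decode is enough to read it.
        String ascii = new String(head, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(ascii);
        return m.find() ? m.group(1) : "UTF-8"; // no declaration: assume UTF-8
    }

    /** Wraps the stream in a Reader that uses the detected encoding. */
    public static Reader open(InputStream in) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, PEEK);
        byte[] head = new byte[PEEK];
        int len = 0;
        int n;
        while (len < PEEK && (n = pin.read(head, len, PEEK - len)) > 0) {
            len += n;
        }
        String encoding = detect(head, len);
        // Java's UTF-8 decoder does not strip the BOM, so drop it here.
        int skip = hasUtf8Bom(head, len) ? 3 : 0;
        if (len > skip) {
            pin.unread(head, skip, len - skip);
        }
        return new InputStreamReader(pin, encoding);
    }
}
```

A caller would pass the raw byte stream of the document to XmlEncodingSniffer.open(...) and hand the returned Reader to the parser. An unknown encoding name surfaces as an UnsupportedEncodingException from InputStreamReader.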

Xerces solves this problem by providing its own readers for these encodings. I think Xerces's solution is
too complex for plexus-utils.

To solve this issue, it's not sufficient to change only plexus-utils, because
DefaultMavenProjectBuilder reads the POM with a FileReader without specifying an encoding.
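
The FileReader point can be illustrated with a small sketch. A FileReader decodes bytes with the platform default encoding before the parser ever sees them, so the parser has no chance to honor the XML declaration. Handing the parser the raw byte stream avoids that. The example below uses the JDK's built-in DOM parser rather than MXParser, purely so that it is self-contained. The class name PomByteStreamDemo and the method readName are hypothetical, and this is not how DefaultMavenProjectBuilder is written.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical demo class; uses the JDK DOM parser as a stand-in for MXParser. */
public class PomByteStreamDemo {
    /**
     * Reads the first <name> element from a POM-like file. The file is opened
     * as bytes, not through a FileReader, so the parser decides the encoding
     * from the XML declaration instead of the platform default.
     */
    public static String readName(File pom) throws Exception {
        try (InputStream in = new FileInputStream(pom)) {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);
            return doc.getDocumentElement()
                .getElementsByTagName("name")
                .item(0)
                .getTextContent();
        }
    }
}
```

With a file that declares encoding="ISO-8859-1" and contains non-ASCII characters in its name element, readName is expected to return the same text regardless of the platform default encoding, because no Reader with an implicit charset sits between the file and the parser.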



-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (MNG-2148) MXParser can't handle the encoding declaration in XML declaration

Posted by "Naoki Nose (JIRA)" <ji...@codehaus.org>.
    [ http://jira.codehaus.org/browse/MNG-2148?page=comments#action_64002 ] 

Naoki Nose commented on MNG-2148:
---------------------------------

I have attached two additional files, because my first attachment isn't in diff format.

The file plexus-utils.diff is the patch for plexus-utils. It contains the XML encoding detection code.
The file plexus-utils-test-resource.tar.gz contains the test resource files for MXParserTest.java.
These resource files are binary, so plexus-utils.diff couldn't contain them.



> MXParser can't handle the encoding declaration in XML declaration 
> ------------------------------------------------------------------
>
>          Key: MNG-2148
>          URL: http://jira.codehaus.org/browse/MNG-2148
>      Project: Maven 2
>         Type: Bug

>   Components: POM
>     Reporter: Naoki Nose
>  Attachments: plexus-utils-test-resource.tar.gz, plexus-utils.diff, src.jar



[jira] Updated: (MNG-2148) MXParser can't handle the encoding declaration in XML declaration

Posted by "Naoki Nose (JIRA)" <ji...@codehaus.org>.
     [ http://jira.codehaus.org/browse/MNG-2148?page=all ]

Naoki Nose updated MNG-2148:
----------------------------

    Attachment: plexus-utils.diff

> MXParser can't handle the encoding declaration in XML declaration 
> ------------------------------------------------------------------
>
>          Key: MNG-2148
>          URL: http://jira.codehaus.org/browse/MNG-2148
>      Project: Maven 2
>         Type: Bug

>   Components: POM
>     Reporter: Naoki Nose
>  Attachments: plexus-utils.diff, src.jar



[jira] Updated: (MNG-2148) MXParser can't handle the encoding declaration in XML declaration

Posted by "Naoki Nose (JIRA)" <ji...@codehaus.org>.
     [ http://jira.codehaus.org/browse/MNG-2148?page=all ]

Naoki Nose updated MNG-2148:
----------------------------

    Attachment: plexus-utils-test-resource.tar.gz

> MXParser can't handle the encoding declaration in XML declaration 
> ------------------------------------------------------------------
>
>          Key: MNG-2148
>          URL: http://jira.codehaus.org/browse/MNG-2148
>      Project: Maven 2
>         Type: Bug

>   Components: POM
>     Reporter: Naoki Nose
>  Attachments: plexus-utils-test-resource.tar.gz, plexus-utils.diff, src.jar
