Posted to dev@maven.apache.org by ji...@codehaus.org on 2003/09/25 16:26:10 UTC

[jira] Updated: (MAVEN-382) POM encoding problem

The following issue has been updated:

    Updater: Norbert Pabis (mailto:npabis@e-point.pl)
       Date: Thu, 25 Sep 2003 9:25 AM
    Comment:
This patch resolves problems with characters in the POM that are not in ISO-8859-1.

There are two issues:
1. In MavenUtils, getProjectString always used ISO-8859-1.
That cannot work with characters outside this charset.
To fix that I needed the original encoding of project.xml. Unfortunately SAX, which hides behind betwixt and digester, never shares
this information. So there are several possible workarounds:
- to have a variable <pomEncoding>
- to have a property pom.encoding
- to read the encoding from the first several bytes of project.xml "by hand"
- to decide that project.xml is always in UTF-8
I chose the last option.
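
To illustrate why a fixed ISO-8859-1 encoding loses such characters, here is a minimal sketch using only the Java standard library. The class and method names are hypothetical and are not taken from MavenUtils or from the patch; it only shows the general effect of encoding text with a charset that cannot represent it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PomEncodingDemo {
    // Encode the text to bytes and decode it again with the same charset.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the source file
        // itself does not depend on any particular encoding.
        String name = "\u4e2d\u6587";

        // ISO-8859-1 cannot represent these characters, so they are lost.
        System.out.println(roundTrip(name, StandardCharsets.ISO_8859_1).equals(name));

        // UTF-8 can represent every Unicode character, so the text survives.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name));
    }
}
```

With ISO-8859-1 each unmappable character comes back as "?", which matches the kind of damage described in this issue.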

2. The xdoc plugin uses the <parse> tag from jelly-tags-xml. This tag uses dom4j, which has a bug in SAXReader.parse(File).
Maven depends on dom4j 1.2.8; the latest release is 1.4. The bug is already fixed in CVS, but no newer release is available.
Now I could do two things:
- change Maven dependency to dom4j-snapshot (risky)
- use <xml:parse xml="URL"> instead of <xml:parse xml="File">
I chose the last option.
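
As a rough illustration of the second workaround, the sketch below converts a file path to a URL with the Java standard library, so that a parser can be handed a URL rather than a File. The class and method names are hypothetical; how the patch itself builds the URL inside the Jelly script is not shown in this message.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

public class FileToUrlDemo {
    // Convert a file path to a file: URL. Going through toURI() escapes
    // characters (such as spaces) that are not legal in a URL.
    static URL toUrl(String path) throws MalformedURLException {
        return new File(path).toURI().toURL();
    }

    public static void main(String[] args) throws MalformedURLException {
        // The printed URL depends on the current working directory.
        System.out.println(toUrl("project.xml"));
    }
}
```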

In addition I included tests that ensure encoding handling does not break when dependencies change.

This patch will probably fix  http://jira.codehaus.org/secure/ViewIssue.jspa?key=MAVEN-847 
too.
    Changes:
             Attachment changed to encoding_problems_patch.gz
    ---------------------------------------------------------------------
For a full history of the issue, see:

  http://jira.codehaus.org/secure/ViewIssue.jspa?key=MAVEN-382&page=history

---------------------------------------------------------------------
View the issue:

  http://jira.codehaus.org/secure/ViewIssue.jspa?key=MAVEN-382


Here is an overview of the issue:
---------------------------------------------------------------------
        Key: MAVEN-382
    Summary: POM encoding problem
       Type: Improvement

     Status: Unassigned
   Priority: Major

 Time Spent: Unknown
  Remaining: Unknown

    Project: maven
 Components: 
             core
   Fix Fors:
             1.1
   Versions:
             1.0-beta-9

   Assignee: 
   Reporter: Kuisong Tong

    Created: Wed, 9 Apr 2003 4:28 AM
    Updated: Thu, 25 Sep 2003 9:25 AM

Description:
I use Chinese in my project, but when I use Chinese in my report name, the site displays incorrectly.

This patch lets Chinese text in the site generated from the POM display normally.

But Chinese text still does not display normally if it is in project.xml.


---------------------------------------------------------------------
JIRA INFORMATION:
This message is automatically generated by JIRA.

If you think it was sent incorrectly contact one of the administrators:
   http://jira.codehaus.org/secure/Administrators.jspa

If you want more information on JIRA, or have a bug to report see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: [jira] Updated: (MAVEN-382) POM encoding problem

Posted by Norbert Pabiś <np...@e-point.pl>.
There is yet another possibility.
project.xml can always be in ISO-8859-1, and national characters can then
be written as numeric character references: &#xhhh; (hex) or &#nnn; (dec).
In this case the internal encoding in MavenUtils.getProjectString must be
set to UTF-16.
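
A minimal sketch of what this alternative would mean for the content of project.xml, using only the Java standard library. The class and method names are hypothetical, and the sketch assumes the input is already XML-safe (it does not escape & or <). It rewrites every character outside ISO-8859-1 as a hexadecimal numeric character reference.

```java
public class NumericCharRefs {
    // Replace each code point above U+00FF with a reference of the form
    // &#xhhh; and leave ISO-8859-1 characters untouched.
    static String escapeNonLatin1(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp <= 0xFF) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            }
            // Advance by one or two chars, depending on the code point.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonLatin1("Pabi\u015b")); // prints Pabi&#x15B;
    }
}
```

The resulting file contains only ISO-8859-1 characters, and an XML parser expands the references back to the original characters when it reads the document.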

What is the preferred solution?
Please comment.

jira@codehaus.org wrote:
> The following issue has been updated:
> 
>     Updater: Norbert Pabis (mailto:npabis@e-point.pl)
>        Date: Thu, 25 Sep 2003 9:25 AM
>     Comment:
> This patch resolves problems with characters in POM that are not from ISO-8859-1.
> 
> There are two issues:
> 1. In MavenUtils getProjectString was using always ISO-8859-1.
> That could not work with characters outside this charset.
> To have that fixed I needed original project.xml encoding. Unfortunately SAX which hides behind betwixt and digester never share
> this information. So there are several possible workarounds:
> - to have a variable <pomEncoding>
> - to have a property pom.encoding
> - to read encoding from several first bytes of project.xml "by hand"
> - to decide that project.xml is always in UTF-8
> I chose the last option.
> 
> 2. In xdoc plugin tag <parse> from jelly-tags-xml is used. This tag uses dom4j which has a bug in SAXReader.parse(File).
> Maven depends on dom4j 1.2.8, last version is 1.4 and this bug is already fixed in CVS but no newer version is available.
> Now I could do two things:
> - change Maven dependency to dom4j-snapshot (risky)
> - use <xml:parse xml="URL"> instead of <xml:parse xml="File">
> I chose the last option.
> 
> In addition I included tests that ensure that with changing dependencies encoding issues will not be broken.
> 
> This patch will probably fix  http://jira.codehaus.org/secure/ViewIssue.jspa?key=MAVEN-847 
> too.
>     Changes:
>              Attachment changed to encoding_problems_patch.gz
>     ---------------------------------------------------------------------
> For a full history of the issue, see:
> 
>   http://jira.codehaus.org/secure/ViewIssue.jspa?key=MAVEN-382&page=history

-- 
Norbert Pabiś

Nobody expects the Debian Inquisition!
Our two weapons are fear and surprise... and ruthless efficiency!

