Posted to issues@maven.apache.org by "Jukka Harkki (JIRA)" <ji...@apache.org> on 2016/02/04 15:12:39 UTC
[jira] [Created] (MCHANGELOG-142) UTF-8 Encoding doubled
Jukka Harkki created MCHANGELOG-142:
---------------------------------------
Summary: UTF-8 Encoding doubled
Key: MCHANGELOG-142
URL: https://issues.apache.org/jira/browse/MCHANGELOG-142
Project: Maven Changelog Plugin
Issue Type: Bug
Affects Versions: 2.3
Reporter: Jukka Harkki
Generating the changelog.xml file double-encodes UTF-8 when the git commit comments are already in UTF-8. For example, if the outputEncoding property is set to ISO-8859-1, the output is (od dump):
{code}
0004060 7375 7420 696f 696d 616d 6e61 6d20 c379
u s t o i m i m a a n m y ├
0004100 73b6 6c20 7369 a4c3 6b79 6573 7373 a4c3
 s l i s ├ ñ y k s e s s ├ ñ
{code}
And when set to UTF-8 the output is:
{code}
0004060 6d69 6d69 6161 206e 796d 83c3 b6c2 2073
i m i m a a n m y ├ â ┬ Â s
{code}
With UTF-8 output encoding, Scandinavian umlauts come out garbled: the dump shows C3 83 C2 B6, whereas C3 B6 is the correct UTF-8 code for the letter "ö".
The ISO-8859-1 setting would otherwise be good enough for the site documentation, but because the XML header of the file then declares ISO-8859-1 encoding, the rest of the process fails.
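The byte sequences in the two dumps can be reproduced outside the plugin. The following is only a minimal sketch, not plugin code: the class and method names are made up for illustration, and it assumes the doubling comes from UTF-8 bytes being read as ISO-8859-1 and then encoded as UTF-8 a second time.

```java
import java.nio.charset.StandardCharsets;

public class DoubleEncodingDemo {

    // Format bytes as lowercase hex pairs separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Encode the text once: these are the correct UTF-8 bytes.
    static byte[] encodeOnce(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Assumed failure mode: the UTF-8 bytes are misread as ISO-8859-1
    // and the resulting characters are encoded as UTF-8 again.
    static byte[] encodeTwice(String text) {
        String misread = new String(encodeOnce(text), StandardCharsets.ISO_8859_1);
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String o = "\u00f6"; // the letter "ö"
        System.out.println(hex(encodeOnce(o)));  // c3 b6
        System.out.println(hex(encodeTwice(o))); // c3 83 c2 b6
    }
}
```

Under that assumption, the second line matches the garbled bytes in the UTF-8 dump, and the first line matches what the ISO-8859-1 dump contains.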
I modified the writeChangelogXml() method of class ChangeLogReport by commenting out the writer change made for issue MCHANGELOG-86:
{code}
PrintWriter pw = new PrintWriter(new BufferedOutputStream(new FileOutputStream(outputXML)));
pw.write(changelogXml.toString());
pw.flush();
pw.close();
// MCHANGELOG-86
// Writer writer = WriterFactory.newWriter( new BufferedOutputStream( new FileOutputStream( outputXML ) ),
// getOutputEncoding() );
// writer.write(changelogXml.toString());
// writer.flush();
// writer.close();
{code}
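Note that the PrintWriter(OutputStream) constructor used in the workaround converts characters with the platform default charset, so its output may differ between machines. As a hedged illustration of how the writer's charset determines the bytes, here is a small standalone sketch; ExplicitCharsetWriterDemo and write() are hypothetical names, not part of the plugin.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetWriterDemo {

    // Write the text through a Writer bound to an explicit charset
    // and return the bytes that reached the underlying stream.
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String o = "\u00f6"; // the letter "ö", held correctly in memory
        byte[] utf8 = write(o, StandardCharsets.UTF_8);
        byte[] latin1 = write(o, StandardCharsets.ISO_8859_1);
        System.out.println(utf8.length);   // 2 bytes: C3 B6
        System.out.println(latin1.length); // 1 byte: F6
    }
}
```

A correctly decoded "ö" would therefore appear as the single byte F6 in ISO-8859-1 output, not as C3 B6.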
There may be double encoding in the Writer, since a couple of lines above, the change set is already created with the encoding information. However, this is just a wild guess, since I did not check the implementation of changelogSet.toXML() or writer.write(). The line in question:
{code}
String changeset = changelogSet.toXML(getOutputEncoding());
{code}
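One way to narrow the guess down would be to inspect the string before it reaches any writer. The check below is a rough heuristic with a hypothetical name (looksDoubleEncoded), not an existing plugin or library method: it only detects the pattern seen in the dumps, where a character from U+00C0 to U+00FF has turned into U+00C3 followed by a character in the range U+0080 to U+00BF.

```java
public class MojibakeCheck {

    // Heuristic: true if the text contains U+00C3 followed by a character
    // in U+0080..U+00BF, which is how a UTF-8 encoded Latin-1 letter looks
    // after its bytes were decoded as ISO-8859-1. Legitimate text with
    // such a pair would be a false positive.
    static boolean looksDoubleEncoded(String text) {
        for (int i = 0; i + 1 < text.length(); i++) {
            char lead = text.charAt(i);
            char next = text.charAt(i + 1);
            if (lead == '\u00c3' && next >= '\u0080' && next <= '\u00bf') {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(looksDoubleEncoded("my\u00f6s"));       // false
        System.out.println(looksDoubleEncoded("my\u00c3\u00b6s")); // true
    }
}
```

If such a check returned true for the changeset string, the damage would already be present before the writer runs; if it returned false, the writer step would be the more likely suspect.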
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)