Posted to issues@maven.apache.org by "Anders Hammar (JIRA)" <ji...@apache.org> on 2016/05/11 07:56:13 UTC

[jira] [Commented] (MRESOURCES-171) ISO8859-1 properties files get changed into UTF-8 when filtered

    [ https://issues.apache.org/jira/browse/MRESOURCES-171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15279740#comment-15279740 ] 

Anders Hammar commented on MRESOURCES-171:
------------------------------------------

JEP 226 seems to be part of Java SE 9. It changes the default encoding for properties resource bundles to UTF-8, but falls back to reading the input as ISO-8859-1 if a decoding error occurs.
{quote}
Note: PropertyResourceBundle can be constructed either from an InputStream or a Reader, which represents a property file. Constructing a PropertyResourceBundle instance from an InputStream requires that the input stream be encoded in UTF-8. By default, if a MalformedInputException or an UnmappableCharacterException occurs on reading the input stream, then the PropertyResourceBundle instance resets to the state before the exception, re-reads the input stream in ISO-8859-1, and continues reading. If the system property java.util.PropertyResourceBundle.encoding is set to either "ISO-8859-1" or "UTF-8", the input stream is solely read in that encoding, and throws the exception if it encounters an invalid sequence. If "ISO-8859-1" is specified, characters that cannot be represented in ISO-8859-1 encoding must be represented by Unicode Escapes as defined in section 3.3 of The Java™ Language Specification whereas the other constructor which takes a Reader does not have that limitation. Other encoding values are ignored for this system property.
{quote}
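
As a rough illustration of the Reader-based constructor mentioned in the quote, the sketch below decodes the bytes with an explicitly chosen charset before handing them to PropertyResourceBundle, so the result does not depend on the InputStream default or on the fallback. The class name ExplicitEncodingBundle, the load helper and the "greeting" key are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;

// Hypothetical helper class, not part of the JDK or of any Maven plugin.
class ExplicitEncodingBundle {

    // Decode the raw bytes with the caller's charset and build the bundle
    // through the Reader constructor, which does no charset guessing.
    static PropertyResourceBundle load(byte[] raw, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(raw), charset)) {
            return new PropertyResourceBundle(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The same logical content, encoded two different ways.
        byte[] latin1 = "greeting=caf\u00e9\n".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = "greeting=caf\u00e9\n".getBytes(StandardCharsets.UTF_8);

        String fromLatin1 = load(latin1, StandardCharsets.ISO_8859_1).getString("greeting");
        String fromUtf8 = load(utf8, StandardCharsets.UTF_8).getString("greeting");

        // Both bundles yield the same value because each was decoded with
        // the charset it was actually written in.
        System.out.println(fromLatin1.equals(fromUtf8));
    }
}
```

With this approach the caller has to know the file's encoding up front, which is exactly the information the resources plugin would need per file.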

Still, I see value in the resources plugin being able to handle specific resource files in an encoding other than the specified sourceEncoding. Currently we only handle the case where everything uses the same encoding (which is of course the best setup), but that may not always hold. For example, XML files declare their encoding themselves, and that declaration should be honored.
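
To make the idea concrete, here is a minimal sketch of filtering with a per-file encoding. It is not how the resources plugin is implemented; the class PerFileEncodingFilter, the extension-based rule in charsetFor and the naive ${key} replacement are all assumptions chosen only to show a file being read and written back in the same charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch, not the plugin's actual filtering code.
class PerFileEncodingFilter {

    // Assumed rule: .properties files are ISO-8859-1, every other file
    // uses the project-wide source encoding.
    static Charset charsetFor(String fileName, Charset sourceEncoding) {
        return fileName.endsWith(".properties") ? StandardCharsets.ISO_8859_1 : sourceEncoding;
    }

    // Decode, substitute ${key} tokens, and re-encode with the SAME charset
    // so the output file keeps the encoding of the input file.
    static byte[] filter(String fileName, byte[] raw, Charset sourceEncoding,
                         Map<String, String> values) {
        Charset charset = charsetFor(fileName, sourceEncoding);
        String text = new String(raw, charset);
        for (Map.Entry<String, String> entry : values.entrySet()) {
            text = text.replace("${" + entry.getKey() + "}", entry.getValue());
        }
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        byte[] input = "name=${owner} caf\u00e9\n".getBytes(StandardCharsets.ISO_8859_1);
        byte[] output = filter("test.properties", input, StandardCharsets.UTF_8,
                Collections.singletonMap("owner", "Alex"));
        // Decode with ISO-8859-1 to confirm the filtered file kept its encoding.
        System.out.print(new String(output, StandardCharsets.ISO_8859_1));
    }
}
```

A real implementation would also have to decide what to do with substituted values that cannot be represented in the target charset; for .properties files, writing them as \uXXXX escapes would be one option.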

> ISO8859-1 properties files get changed into UTF-8 when filtered
> ---------------------------------------------------------------
>
>                 Key: MRESOURCES-171
>                 URL: https://issues.apache.org/jira/browse/MRESOURCES-171
>             Project: Maven Resources Plugin
>          Issue Type: Bug
>          Components: filtering
>            Reporter: Alex Collins
>            Priority: Minor
>         Attachments: filtering-bug.zip
>
>
> Create:
> src/main/resources/test.properties
> And add an ISO8859-1 character that is not ASCII or UTF-8; do not use \uXXXX formatting.
> When adding this line:
> <resource><directory>src/main/resources</directory><filtering>true</filtering></resource>
> Expected:
> ISO8859-1 encoded file in jar.
> Actual:
> UTF-8 encoded file in jar.
> ---
> If there are any property files (which can only be ISO8859-1) they appear to be converted into UTF-8 in the jar.
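
The effect described in the report can be reproduced outside Maven. The sketch below assumes the filtered file ends up re-encoded from ISO-8859-1 to UTF-8 and is then loaded with java.util.Properties.load(InputStream), which decodes its input as ISO-8859-1. The class TranscodedPropertiesDemo and its helpers are illustrative names only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Illustrative reproduction, independent of Maven.
class TranscodedPropertiesDemo {

    // Mimics the reported outcome: ISO-8859-1 input written out as UTF-8.
    static byte[] transcodeToUtf8(byte[] latin1) {
        return new String(latin1, StandardCharsets.ISO_8859_1).getBytes(StandardCharsets.UTF_8);
    }

    // Properties.load(InputStream) always decodes the stream as ISO-8859-1.
    static String loadValue(byte[] raw, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(raw));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        byte[] original = "greeting=caf\u00e9\n".getBytes(StandardCharsets.ISO_8859_1);
        byte[] transcoded = transcodeToUtf8(original);

        // The e-acute is one byte (0xE9) in ISO-8859-1 but two bytes
        // (0xC3 0xA9) in UTF-8, so the transcoded file is one byte longer.
        System.out.println(transcoded.length - original.length);

        // Loaded as ISO-8859-1, the two UTF-8 bytes become two characters.
        System.out.println(loadValue(original, "greeting").length());
        System.out.println(loadValue(transcoded, "greeting").length());
    }
}
```

The original file loads with a 4-character value, while the transcoded one yields 5 characters because the two UTF-8 bytes are each read as a separate ISO-8859-1 character.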



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)