Posted to dev@ant.apache.org by bu...@apache.org on 2005/08/21 12:48:39 UTC
DO NOT REPLY [Bug 36290] New: - <copy filtering="on"> mutilates LATIN1 text files
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED
COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=36290>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=36290
Summary: <copy filtering="on"> mutilates LATIN1 text files
Product: Ant
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Core tasks
AssignedTo: dev@ant.apache.org
ReportedBy: anathaniel@apache.org
It is well documented that filtering may corrupt binary files. I have now found
that a similar issue exists with text files.
When a text file with LATIN1 encoding is read assuming UTF-8 encoding, a-umlaut
and other non-ASCII characters are replaced by '?' because these LATIN1 byte
values are not valid UTF-8 sequences.
This is exactly what happens when this task
  <copy filtering="on" todir="bar">
    <fileset dir="foo">
      <include name="**/*.xml"/>
    </fileset>
  </copy>
is applied to XML files with encoding="iso-8859-1" on a platform with UTF-8 as
its default encoding.
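The effect can be reproduced outside Ant with a few lines of Java. This is only
a sketch of the decode-then-encode round trip, not Ant's actual copy code; the
class and method names are made up for illustration. On current JDKs the
invalid byte is decoded to the replacement character U+FFFD, which many tools
then display as '?'.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1AsUtf8Demo {
    /**
     * Decodes the bytes as UTF-8 and encodes them as UTF-8 again,
     * roughly what a filtering copy does on a UTF-8 platform.
     * (Hypothetical helper, not part of Ant.)
     */
    static byte[] roundTripUtf8(byte[] input) {
        String decoded = new String(input, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 'b', a-umlaut, 'r' in ISO-8859-1: the a-umlaut is the single byte 0xE4.
        byte[] latin1 = {'b', (byte) 0xE4, 'r'};
        byte[] copied = roundTripUtf8(latin1);

        // 0xE4 followed by 'r' is not a valid UTF-8 sequence, so the
        // original byte is lost and the copy differs from the source.
        System.out.println("unchanged: " + Arrays.equals(latin1, copied));
        System.out.println("copied bytes: " + Arrays.toString(copied));
    }
}
```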
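The easy workaround is to set the encoding explicitly: <copy filtering="on"
todir="bar" encoding="iso-8859-1">. This also copies UTF-8 encoded files
containing multi-byte character sequences correctly, because iso-8859-1 maps
every byte value to exactly one character, so the bytes survive decoding and
re-encoding unchanged. Token replacement of ASCII strings also works correctly,
independent of the encoding.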
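A small Java sketch of why the workaround is safe, assuming the token and its
replacement value are pure ASCII. The filter() helper and the @V@ token are
hypothetical stand-ins for Ant's filtering, not its real implementation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1FilterDemo {
    /**
     * Mimics a filtering copy: decode, replace one token, encode again.
     * (Hypothetical helper, not part of Ant.)
     */
    static byte[] filter(byte[] input, Charset charset, String token, String value) {
        String text = new String(input, charset);
        return text.replace(token, value).getBytes(charset);
    }

    public static void main(String[] args) {
        // "@V@ " followed by a-umlaut, once as UTF-8 (0xC3 0xA4)
        // and once as ISO-8859-1 (0xE4).
        byte[] utf8File = {'@', 'V', '@', ' ', (byte) 0xC3, (byte) 0xA4};
        byte[] latin1File = {'@', 'V', '@', ' ', (byte) 0xE4};

        // Both files are processed as ISO-8859-1, whatever their real encoding.
        byte[] utf8Out = filter(utf8File, StandardCharsets.ISO_8859_1, "@V@", "1.0");
        byte[] latin1Out = filter(latin1File, StandardCharsets.ISO_8859_1, "@V@", "1.0");

        // The token is replaced and the non-ASCII bytes come out untouched.
        System.out.println(Arrays.toString(utf8Out));
        System.out.println(Arrays.toString(latin1Out));
    }
}
```

The sketch only holds for ASCII tokens and values: a replacement value with
non-ASCII characters would be written as iso-8859-1 bytes, which is wrong
inside a UTF-8 file.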
My proposal is to make this the default behaviour of the <copy> task:
if no explicit encoding is specified, do not use the platform-dependent default
encoding (which may be UTF-8) but always use iso-8859-1.