Posted to dev@commons.apache.org by Wolfgang Glas <wo...@ev-i.at> on 2009/03/01 22:53:48 UTC
Re: [compress] State of encoding support in ZIP package
Stefan Bodewig wrote:
> On 2009-02-27, Wolfgang Glas <wo...@ev-i.at> wrote:
>
>> Additionally, my experience with WinZip shows that WinZip writes weird
>> filenames to the single-byte version of the filename when a Unicode field is
>> present.
>
> Hmm, native encoding I'd guess.
Something like this; it looks like they are writing the LSB of a 2-byte value...
> Wolfgang, could you do me a favor and please review what I've written
> for the Ant zip task manual page in svn revision 748593
> <http://svn.apache.org/viewvc?view=rev&revision=748593>, in particular
> <http://svn.apache.org/viewvc/ant/core/trunk/docs/manual/CoreTasks/zip.html?r1=748593&r2=748592&pathrev=748593>?
Seems quite OK ;-)
The one thing I'd like to discuss is the semantics of the useEFS flag in
ZipArchiveOutputStream:
My understanding from the previous discussion was that we need a mode where file
names not encodable in the chosen encoding are encoded in UTF-8, which is in
turn indicated by setting the EFS flag on the corresponding ZIP entry. (That's
the way 7-zip handles Unicode filenames...)
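To make the fallback mode concrete, here is a minimal sketch using only the JDK. It is not the Commons Compress implementation: the class `Utf8FallbackSketch`, the holder `EncodedName` and the method `encodeName` are hypothetical names invented for illustration. The one fixed fact it relies on is that the EFS (language encoding) flag is bit 11 of the general purpose bit flag of a ZIP entry.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of Commons Compress.
final class Utf8FallbackSketch {
    /** Bit 11 of the general purpose bit flag: names are UTF-8 (EFS). */
    static final int EFS_FLAG = 1 << 11;

    /** Hypothetical holder: the raw name bytes plus the flag bits to write. */
    static final class EncodedName {
        final byte[] bytes;
        final int generalPurposeBits;

        EncodedName(byte[] bytes, int generalPurposeBits) {
            this.bytes = bytes;
            this.generalPurposeBits = generalPurposeBits;
        }
    }

    /**
     * Encodes the name in the chosen charset when every character is
     * representable; otherwise falls back to UTF-8 and sets the EFS flag.
     */
    static EncodedName encodeName(String name, Charset chosen) {
        CharsetEncoder encoder = chosen.newEncoder();
        if (encoder.canEncode(name)) {
            // The chosen encoding suffices; flag the entry only if it is UTF-8 anyway.
            boolean isUtf8 = chosen.equals(StandardCharsets.UTF_8);
            return new EncodedName(name.getBytes(chosen), isUtf8 ? EFS_FLAG : 0);
        }
        // Fallback: this single entry is written as UTF-8 and marked with EFS.
        return new EncodedName(name.getBytes(StandardCharsets.UTF_8), EFS_FLAG);
    }
}
```

With ISO-8859-1 as the chosen encoding, a name such as "readme.txt" would be written unflagged, while a name containing a CJK character would be written as UTF-8 with the EFS bit set on that entry only.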
The current implementation of the useEFS flag simply allows disabling the
creation of the EFS flag in ZIP entries whose names are UTF-8. This approach
does not conform to the specifications I've read, and I have not seen a single
zip implementation that is disturbed by the EFS flag.
My opinion would be to simply drop the possibility of inhibiting the EFS flag in
UTF-8 encoded files and to introduce a new flag that switches on UTF-8
fallbacks (7-zip mode...).
What other opinions are out there?
Wolfgang
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org
Re: [compress] State of encoding support in ZIP package
Posted by Wolfgang Glas <wo...@ev-i.at>.
Stefan Bodewig wrote:
> On 2009-03-01, Wolfgang Glas <wo...@ev-i.at> wrote:
>
>> My understanding from the previous discussion was that we need a
>> mode where file names not encodable in the chosen encoding are
>> encoded in UTF-8, which is in turn indicated by setting the EFS flag
>> on the corresponding ZIP entry. (That's the way 7-zip handles Unicode
>> filenames...)
>
> This is different from what we've currently implemented, but may still
> be useful.
>
>> The current implementation of the useEFS flag simply allows
>> disabling the creation of the EFS flag in ZIP entries whose names
>> are UTF-8. This approach does not conform to the specifications I've
>> read, and I have not seen a single zip implementation that is
>> disturbed by the EFS flag.
>
> But if there should be one - say zlib on z/OS or some other strange
> thing - it will be good to have that option available.
OK, agreed, let's keep this flag ;-)
>> My opinion would be to simply drop the possibility of inhibiting the
>> EFS flag in UTF-8 encoded files and to introduce a new flag that
>> switches on UTF-8 fallbacks (7-zip mode...).
>
> I'm fine with an additional flag that would encode non-encodable file
> names as UTF-8 (not sure about the name of the flag, and I have a
> long-standing history of choosing bad names), but prefer to keep the
> existing option for the completely orthogonal case of whether we set
> the EFS flag at all.
OK, I will introduce an additional flag; let's call it
'setFallbackToUtf8(boolean)'. I will prepare a patch right after you've reviewed
and (possibly) committed my latest encoding refactoring patch.
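To illustrate why the two switches are orthogonal, here is a rough sketch of how they might interact. It is an assumption about the design under discussion, not the patch itself: the class `NameEncodingOptionsSketch` and the methods `charsetFor` and `flagBitsFor` are hypothetical, and only the names useEFS and setFallbackToUtf8 come from this thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the two independent switches.
final class NameEncodingOptionsSketch {
    /** Bit 11 of the general purpose bit flag: names are UTF-8 (EFS). */
    static final int EFS_FLAG = 1 << 11;

    private boolean useEfs = true;           // existing option: set EFS at all?
    private boolean fallbackToUtf8 = false;  // proposed option: 7-zip style fallback

    void setUseEfs(boolean useEfs) {
        this.useEfs = useEfs;
    }

    void setFallbackToUtf8(boolean fallbackToUtf8) {
        this.fallbackToUtf8 = fallbackToUtf8;
    }

    /** Decides which charset is actually used for one entry name. */
    Charset charsetFor(String name, Charset chosen) {
        if (fallbackToUtf8 && !chosen.newEncoder().canEncode(name)) {
            return StandardCharsets.UTF_8;
        }
        // Without the fallback, String.getBytes(Charset) would substitute the
        // charset's replacement bytes for characters it cannot represent.
        return chosen;
    }

    /** Decides the flag bits, independently of how the charset was picked. */
    int flagBitsFor(Charset actual) {
        return (useEfs && actual.equals(StandardCharsets.UTF_8)) ? EFS_FLAG : 0;
    }
}
```

In this sketch the fallback switch only influences which charset is used per entry, while the EFS switch only influences whether a UTF-8 name is advertised in the flag bits, so all four combinations remain available.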
Best regards,
Wolfgang
Re: [compress] State of encoding support in ZIP package
Posted by Stefan Bodewig <bo...@apache.org>.
On 2009-03-01, Wolfgang Glas <wo...@ev-i.at> wrote:
> My understanding from the previous discussion was that we need a
> mode where file names not encodable in the chosen encoding are
> encoded in UTF-8, which is in turn indicated by setting the EFS flag
> on the corresponding ZIP entry. (That's the way 7-zip handles Unicode
> filenames...)
This is different from what we've currently implemented, but may still
be useful.
> The current implementation of the useEFS flag simply allows
> disabling the creation of the EFS flag in ZIP entries whose names
> are UTF-8. This approach does not conform to the specifications I've
> read, and I have not seen a single zip implementation that is
> disturbed by the EFS flag.
But if there should be one - say zlib on z/OS or some other strange
thing - it will be good to have that option available.
> My opinion would be to simply drop the possibility of inhibiting the
> EFS flag in UTF-8 encoded files and to introduce a new flag that
> switches on UTF-8 fallbacks (7-zip mode...).
I'm fine with an additional flag that would encode non-encodable file
names as UTF-8 (not sure about the name of the flag, and I have a
long-standing history of choosing bad names), but prefer to keep the
existing option for the completely orthogonal case of whether we set
the EFS flag at all.
Stefan