Posted to issues@commons.apache.org by "Gaurav Mittal (JIRA)" <ji...@apache.org> on 2018/11/14 05:03:01 UTC

[jira] [Created] (COMPRESS-471) Zipped files names having non UTF-8 encoding are being replaced with '?' while previewing file.

Gaurav Mittal created COMPRESS-471:
--------------------------------------

             Summary: Zipped files names having non UTF-8 encoding are being replaced with '?' while previewing file.
                 Key: COMPRESS-471
                 URL: https://issues.apache.org/jira/browse/COMPRESS-471
             Project: Commons Compress
          Issue Type: Bug
    Affects Versions: 1.18
            Reporter: Gaurav Mittal
         Attachments: Document(▒Γ║╗)_20150226_11.zip, Incorrect.JPG, correct.JPG

| * All file-name characters that cannot be decoded as UTF-8 are replaced by the '?' symbol.
In the issue scenario the file names are encoded in 'Cp850'. Since the Commons Compress library cannot identify the 'Cp850' charset, it falls back to its default charset, 'UTF-8', and therefore
 we see the '?' symbol.
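
The effect described above can be reproduced without any archive at all. The following is a minimal sketch using only the JDK, not Commons Compress; the class name, the helper `decode` and the sample name are made up for illustration, and it assumes the running JDK ships the 'Cp850' (IBM850) charset:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NameDecodingDemo {

    // Hypothetical helper: interpret raw file-name bytes in the given charset.
    static String decode(byte[] rawName, Charset charset) {
        return new String(rawName, charset);
    }

    public static void main(String[] args) {
        // Assumption: the JDK in use provides Cp850 (an alias of IBM850).
        Charset cp850 = Charset.forName("Cp850");

        // Illustrative name with two non-ASCII characters (e with acute accent).
        String original = "r\u00e9sum\u00e9.txt";
        byte[] rawName = original.getBytes(cp850);

        // Decoding with the wrong charset: the Cp850 bytes for the accented
        // letters are not valid UTF-8, so they become U+FFFD, which many
        // viewers render as '?'.
        String asUtf8 = decode(rawName, StandardCharsets.UTF_8);

        // Decoding with the charset that was used for writing restores the name.
        String asCp850 = decode(rawName, cp850);

        System.out.println("UTF-8 result contains U+FFFD: " + asUtf8.contains("\uFFFD"));
        System.out.println("Cp850 result equals original: " + asCp850.equals(original));
    }
}
```

The same bytes give a damaged name or the correct one depending only on the charset used for decoding, which is consistent with the explicit-encoding constructor working for this archive.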

In our code we open the archive without specifying an encoding:
ZipFile ret = new ZipFile(path);

However, if we pass the encoding explicitly to the constructor, as shown below, it works fine:
ZipFile ret = new ZipFile(new File(path), "Cp850", false);

But this second approach, where we force the encoding to 'Cp850', may cause side effects in some cases.
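
One possible way to limit such side effects, sketched here purely as an illustration and not as something Commons Compress does, is to decode the raw name bytes strictly as UTF-8 first and use 'Cp850' only when that fails. The class `NameFallbackDecoder` and the method `decodeName` are hypothetical; with Commons Compress the raw bytes could presumably be taken from `ZipArchiveEntry.getRawName()`. The sketch assumes the JDK provides the 'Cp850' charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class NameFallbackDecoder {

    // Hypothetical helper: try strict UTF-8 first, then fall back.
    static String decodeName(byte[] rawName, Charset fallback) {
        try {
            // REPORT makes the decoder throw on invalid input instead of
            // silently substituting a replacement character.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(rawName))
                    .toString();
        } catch (CharacterCodingException e) {
            // The bytes are not valid UTF-8, so use the fallback charset.
            return new String(rawName, fallback);
        }
    }

    public static void main(String[] args) {
        Charset cp850 = Charset.forName("Cp850"); // assumption: available in this JDK
        String name = "r\u00e9sum\u00e9.txt";     // illustrative name only

        // The same name stored once as Cp850 bytes and once as UTF-8 bytes.
        String fromCp850 = decodeName(name.getBytes(cp850), cp850);
        String fromUtf8 = decodeName(name.getBytes(StandardCharsets.UTF_8), cp850);

        System.out.println(name.equals(fromCp850));
        System.out.println(name.equals(fromUtf8));
    }
}
```

This is only a heuristic: some Cp850 byte sequences happen to be valid UTF-8 as well and would then be decoded as UTF-8, and the fallback charset still has to be chosen by the caller.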


 --------------------------------------------------------------------------
The code below, from ZipFile in Commons Compress, does not seem to resolve the UTF-8 conflicts and does not restore the file names to their correct form:
 
try {
    final Map<ZipArchiveEntry, NameAndComment> entriesWithoutUTF8Flag =
        populateFromCentralDirectory();
    resolveLocalFileHeaderData(entriesWithoutUTF8Flag);
    success = true;
} finally {
    closed = !success;
    if (!success && closeOnError) {
        IOUtils.closeQuietly(archive);
    }
}|
| |


