Posted to issues@commons.apache.org by "Jens Reimann (JIRA)" <ji...@apache.org> on 2018/07/09 08:07:00 UTC

[jira] [Created] (COMPRESS-459) CPIO fails decoding multibyte name entries

Jens Reimann created COMPRESS-459:
-------------------------------------

             Summary: CPIO fails decoding multibyte name entries
                 Key: COMPRESS-459
                 URL: https://issues.apache.org/jira/browse/COMPRESS-459
             Project: Commons Compress
          Issue Type: Bug
          Components: Compressors
    Affects Versions: 1.17, 1.9
            Reporter: Jens Reimann


When a CPIO archive is read with a multi-byte encoding (e.g. UTF-8) and an entry name contains multi-byte characters, the decoder fails.

The problem IMHO is the "getHeaderPadCount" method, which assumes a single byte per character:

 
{code:java}
    public int getHeaderPadCount(){
        if (this.alignmentBoundary == 0) { return 0; }
        int size = this.headerSize + 1;  // Name has terminating null
        if (name != null) {
            size += name.length();
        }
        final int remain = size % this.alignmentBoundary;
        if (remain > 0){
            return this.alignmentBoundary - remain;
        }
        return 0;
    }
{code}
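However, this may (or may not) be true for UTF-8: "name.length()" counts UTF-16 code units of the decoded String, not the bytes the name occupies in the archive.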
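
To illustrate the mismatch, here is a minimal standalone sketch. It is not taken from Commons Compress; the class name "NameLengthDemo" and the helper "utf8Length" are made up for this example. It only compares the character count of a name with its UTF-8 encoded size.

```java
import java.nio.charset.StandardCharsets;

class NameLengthDemo {
    // Hypothetical helper: number of bytes the name occupies when encoded as UTF-8.
    static int utf8Length(String name) {
        return name.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "readme.txt";      // ASCII only
        String multi = "\u00fcbung.txt";  // starts with U+00FC, two bytes in UTF-8

        // prints "10 chars, 10 bytes"
        System.out.println(ascii.length() + " chars, " + utf8Length(ascii) + " bytes");
        // prints "9 chars, 10 bytes"
        System.out.println(multi.length() + " chars, " + utf8Length(multi) + " bytes");
    }
}
```

For the ASCII name both counts agree, so the padding comes out right. For the second name they differ by one, which is enough to shift the computed padding.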

 

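Also, it wouldn't be enough to call "String#getBytes(…)" to obtain the size, as decoding the name into a String might already have transformed the underlying bytes, so re-encoding is not guaranteed to reproduce them.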
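
One way this can happen (an assumed scenario, not one confirmed for the archive in question) is a name whose raw bytes are not valid in the configured charset. The sketch below uses a made-up class "RoundTripDemo" and a hand-written byte array standing in for bytes read from a stream. It decodes the bytes as UTF-8 and encodes them again.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTripDemo {
    // Hypothetical helper: decode raw name bytes as UTF-8, then encode the result again.
    static byte[] roundTrip(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 'f', 0xFC, 'o': valid ISO-8859-1, but 0xFC is malformed as UTF-8.
        byte[] raw = { 0x66, (byte) 0xFC, 0x6F };
        byte[] again = roundTrip(raw);

        System.out.println("original bytes:   " + raw.length);
        System.out.println("re-encoded bytes: " + again.length);
        System.out.println("identical:        " + Arrays.equals(raw, again));
    }
}
```

The decoder substitutes a replacement character for the malformed byte, so the re-encoded array no longer matches what was stored in the archive, and a size derived from it would be wrong.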

The proper solution would be to take the name size as read from the CPIO stream and pass it to the entry.
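
A rough sketch of that idea follows. It is not the actual Commons Compress API: the class "PadCountSketch", the method "headerPadCount" and its parameters are hypothetical, and the name size is assumed to exclude the terminating null. The values 110 and 4 are the header size and alignment of the new ASCII CPIO format, used here only as sample input.

```java
import java.nio.charset.StandardCharsets;

class PadCountSketch {
    // Hypothetical variant of getHeaderPadCount that takes the encoded name size
    // (in bytes, without the terminating null) instead of using name.length().
    static int headerPadCount(int headerSize, long nameBytes, int alignmentBoundary) {
        if (alignmentBoundary == 0) { return 0; }
        long size = headerSize + 1L + nameBytes;  // Name has terminating null
        int remain = (int) (size % alignmentBoundary);
        return remain > 0 ? alignmentBoundary - remain : 0;
    }

    public static void main(String[] args) {
        String name = "\u00fcbung.txt";  // 9 chars, 10 bytes in UTF-8
        // In a real reader the byte count would come from the header's name size
        // field; getBytes is used here only to simulate that value.
        long sizeFromStream = name.getBytes(StandardCharsets.UTF_8).length;

        // prints "by chars: 0"
        System.out.println("by chars: " + headerPadCount(110, name.length(), 4));
        // prints "by bytes: 3"
        System.out.println("by bytes: " + headerPadCount(110, sizeFromStream, 4));
    }
}
```

With the character count the reader would skip no padding and end up three bytes short of the entry data, which matches the kind of failure described above.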


