Posted to issues@commons.apache.org by "Anvesh Mora (Jira)" <ji...@apache.org> on 2020/01/03 09:58:00 UTC

[jira] [Commented] (COMPRESS-500) Discrepancy in file size extracted using ZipArchieveInputStream and Gzip decompress component

    [ https://issues.apache.org/jira/browse/COMPRESS-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007370#comment-17007370 ] 

Anvesh Mora commented on COMPRESS-500:
--------------------------------------

[~bodewig], we can't simply use ZipFile, because the data is pulled from an endpoint as a stream and should be handled as a stream. We need to achieve this with either ZipInputStream from java.util.zip or ZipArchiveInputStream from commons-compress.

I'm trying both combinations to help us narrow the problem down.

Meanwhile, I'm using the special jar (as suggested in COMPRESS-494) to get logs, but I'm not seeing any log output from the compress package classes. Can you help me understand why this is happening?
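
For reference, the stream-only handling described above can be sketched with java.util.zip alone. This is a minimal illustration and not our production code: the class name ZipOfGzipDemo, the helper names, the sample entry names and the "next=hypothetical" metadata text are all made up for the example, and it assumes the metadata arrives as an ordinary plain-text zip entry. It reads the zip strictly as a stream, gunzips every entry whose name ends in ".gz", and records how many decompressed bytes each entry yields, which is the number we are comparing against the Linux utilities.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; all names here are illustrative assumptions.
final class ZipOfGzipDemo {

    /** Keeps close() on a per-entry stream from closing the shared zip stream. */
    static final class NonClosingStream extends FilterInputStream {
        NonClosingStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() {
            // Intentionally empty: the zip stream must stay open for the next entry.
        }
    }

    /** Streams through the zip and returns the decompressed byte count per entry. */
    static Map<String, Long> decompressedSizes(InputStream source) throws IOException {
        Map<String, Long> sizes = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(source)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                InputStream raw = new NonClosingStream(zip);
                // Gunzip only the .gz entries; other entries are read as-is.
                try (InputStream body = name.endsWith(".gz") ? new GZIPInputStream(raw) : raw) {
                    sizes.put(name, drain(body));
                }
                zip.closeEntry();
            }
        }
        return sizes;
    }

    /** Reads the stream to its end and returns the number of bytes seen. */
    private static long drain(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    /** Builds a small in-memory zip: one gz entry plus a plain-text metadata entry. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("data.gz"));
            GZIPOutputStream gz = new GZIPOutputStream(zip);
            gz.write(new byte[100_000]);
            gz.finish(); // finish rather than close, so the zip stream stays open
            zip.closeEntry();

            zip.putNextEntry(new ZipEntry("meta.txt"));
            zip.write("next=hypothetical".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // In real use the source would be the HTTPS response body, not a byte array.
        System.out.println(decompressedSizes(new ByteArrayInputStream(sampleZip())));
    }
}
```

Swapping ZipInputStream for ZipArchiveInputStream, or GZIPInputStream for GzipCompressorInputStream, in a harness like this and comparing the per-entry byte counts is the kind of combination testing I mean.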

> Discrepancy in file size extracted using ZipArchieveInputStream and Gzip decompress component 
> ----------------------------------------------------------------------------------------------
>
>                 Key: COMPRESS-500
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-500
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Compressors
>    Affects Versions: 1.8, 1.18
>            Reporter: Anvesh Mora
>            Priority: Major
>
> I recently raised a bug about an "invalid Entry Size" issue, COMPRESS-494 (not resolved yet).
>  
> We are now seeing a new issue. Before explaining it: we have the file structure below, and it is received as a stream of data over HTTPS.
>  
> *File Structure*:
> In the zip file:
>      Zero or more gz files, which need to be decompressed
>      Metadata at the end of the zip entries (end of stream), as plain text, used for downloading the next zip file
>  
> In production we are now seeing a problem where the gz file is not decompressed in its entirety. We found that the utilities on CentOS 7 are able to extract and decompress the entire file, whereas our library fails. The sizes differ as follows:
> Using the API: *765460480* bytes
> Using the CentOS 7 Linux utilities: *2032925215* bytes
>  
> We are getting an EOF exception at GzipCompressorInputStream.java:278, and I'm not sure why.
>  
> We need your help on this, as we are blocked in production. This could be a potential fix for our library to make it more robust.
>  
> Please let me know how we can increase the priority if needed.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)