Posted to hdfs-dev@hadoop.apache.org by "Igloo (Jira)" <ji...@apache.org> on 2020/06/30 03:42:00 UTC

[jira] [Created] (HDFS-15445) ZStandardCodec compression may fail when encountering a specific file

Igloo created HDFS-15445:
----------------------------

             Summary: ZStandardCodec compression may fail when encountering a specific file
                 Key: HDFS-15445
                 URL: https://issues.apache.org/jira/browse/HDFS-15445
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs
    Affects Versions: 2.6.5
         Environment: zstd 1.3.3

hadoop 2.6.5 

 

--- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zstd/TestZStandardCompressorDecompressor.java
+++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zstd/TestZStandardCompressorDecompressor.java
@@ -62,10 +62,8 @@
 @BeforeClass
 public static void beforeClass() throws Exception {
 CONFIGURATION.setInt(IO_FILE_BUFFER_SIZE_KEY, 1024 * 64);
- uncompressedFile = new File(TestZStandardCompressorDecompressor.class
- .getResource("/zstd/test_file.txt").toURI());
- compressedFile = new File(TestZStandardCompressorDecompressor.class
- .getResource("/zstd/test_file.txt.zst").toURI());
+ uncompressedFile = new File("/tmp/badcase.data");
+ compressedFile = new File("/tmp/badcase.data.zst");
            Reporter: Igloo
         Attachments: badcase.data, image-2020-06-30-11-35-46-859.png, image-2020-06-30-11-39-17-861.png

*Problem:* 

In our production environment, we write files to HDFS using the zstd compressor. Recently, we found that a specific file can cause the ZStandard compressor to fail.

We can reproduce the issue with that file (attached: badcase.data).

 

*Analysis*: 

ZStandardCompressor uses a single buffer size (zstd's recommended compression output buffer size) for both inBufferSize and outBufferSize:

!image-2020-06-30-11-35-46-859.png|width=475,height=179!

However, zstd provides two separate recommended sizes, one for the input buffer and one for the output buffer:

!image-2020-06-30-11-39-17-861.png!

 

*Workaround*

One workaround is to use the recommended input and output buffer sizes provided by the zstd library:

input buffer size: 131072 (128 * 1024)

output buffer size: 131591
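
For reference, the two numbers above can be reproduced with a little arithmetic. The sketch below is only an illustration under assumptions: the class name ZstdBufferSizes and its methods are hypothetical, and the constants and the bound formula are my reading of how zstd derives ZSTD_CStreamInSize() and ZSTD_CStreamOutSize() (maximum block size, plus the worst-case compressed bound, a 3-byte block header and a 4-byte checksum). Real code should query the library instead of hard-coding these values.

```java
/**
 * Hypothetical helper that recomputes zstd's recommended streaming
 * buffer sizes in plain Java. The constants are assumptions mirrored
 * from zstd's headers; they are not read from the native library.
 */
public class ZstdBufferSizes {
    // Assumed value of ZSTD_BLOCKSIZE_MAX (128 KiB).
    public static final int BLOCK_SIZE_MAX = 128 * 1024;
    // Assumed size in bytes of a zstd block header.
    public static final int BLOCK_HEADER_SIZE = 3;
    // Assumed size in bytes of the optional 32-bit frame checksum.
    public static final int CHECKSUM_SIZE = 4;

    /** Worst-case compressed size for srcSize bytes (assumed formula). */
    public static int compressBound(int srcSize) {
        int margin = srcSize < (128 << 10)
                ? ((128 << 10) - srcSize) >> 11
                : 0;
        return srcSize + (srcSize >> 8) + margin;
    }

    /** Recommended input buffer size: one full block. */
    public static int recommendedInSize() {
        return BLOCK_SIZE_MAX;
    }

    /** Recommended output buffer size: room for one worst-case block. */
    public static int recommendedOutSize() {
        return compressBound(BLOCK_SIZE_MAX) + BLOCK_HEADER_SIZE + CHECKSUM_SIZE;
    }

    public static void main(String[] args) {
        System.out.println("input buffer size:  " + recommendedInSize());
        System.out.println("output buffer size: " + recommendedOutSize());
    }
}
```

The output size is larger than the input size because incompressible data can grow slightly when framed, so an output buffer sized like the input buffer (or vice versa) does not match what zstd recommends for streaming compression.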


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
