Posted to user@hive.apache.org by Zhefu PENG <46...@qq.com> on 2018/07/22 07:18:31 UTC

Using snappy compression codec in Hive

Hi,


Here is something that has confused me these days: I have not installed or built Snappy on my Hadoop cluster, yet when I tested and compared the compression ratios of the Parquet and ORC storage formats, I was able to set Snappy compression for both formats, for example with "TBLPROPERTIES ("orc.compress"="Snappy");" or "set parquet.compression=snappy;", and both of these settings appeared to work. However, when I try to compress the textfile format with Snappy, Hive reports that it "can not find or access the snappy library".
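
To make the setup concrete, here is a minimal sketch of the kind of statements I ran (the table names and columns are simplified placeholders, and the textfile case uses the usual Hadoop output-compression settings rather than the exact statements from my test):

-- ORC table compressed with Snappy, using the table property quoted above
CREATE TABLE test_orc_snappy (id INT, msg STRING)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="Snappy");

-- Parquet table, with the codec chosen through the session setting
SET parquet.compression=snappy;
CREATE TABLE test_parquet_snappy (id INT, msg STRING)
STORED AS PARQUET;

-- Textfile table: compression here is controlled by the Hadoop output
-- settings, and this is the case that fails with the "can not find or
-- access the snappy library" error
CREATE TABLE test_text_snappy (id INT, msg STRING)
STORED AS TEXTFILE;
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
INSERT OVERWRITE TABLE test_text_snappy SELECT id, msg FROM test_orc_snappy;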


I wonder why this happens, and I really doubt whether the ORC and Parquet files are actually using Snappy compression. But the storage size really does become smaller, and it differs from what I get with "gzip" or "zlib" compression.


Looking forward to your reply and help.


Best,
Zhefu Peng