Posted to common-dev@hadoop.apache.org by Tim Broberg <Ti...@exar.com> on 2011/12/13 18:57:26 UTC

Compression configuration peculiarities

I'm running into some head-scratchers in the area of compression configuration, and I'm wondering if I can get a little input on why things are the way they are, and perhaps suggestions on how to handle this.

1 - In a patch related to https://issues.apache.org/jira/browse/HADOOP-5879, the strategy and level flags were moved from io.compress.* to zlib.compress.*.
   1a - Level applies broadly to any compressor, not just zlib.
   1b - Doesn't strategy apply equally to the built-in deflate implementation as it does to zlib?
2 - The Compressor has a Configuration object in its reinit() method, but nowhere else. (This also seems to be related to HADOOP-5879.) If you can reinit the configuration, shouldn't you be able to init it? This just smells like we didn't quite get the boundaries right.
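To make the asymmetry in points 1 and 2 concrete, here is a minimal sketch. It is not the Hadoop code: SketchDeflateCompressor is a hypothetical class, a plain Map stands in for the Configuration object, and the two key names are assumed from the zlib.compress.* prefix. It wraps java.util.zip.Deflater, which accepts both a level and a strategy, and it only sees configuration in reinit().

```java
import java.util.Map;
import java.util.zip.Deflater;

// Hypothetical stand-in for a Hadoop-style compressor; not the real interface.
class SketchDeflateCompressor {
    // Key names are assumptions based on the zlib.compress.* prefix.
    static final String LEVEL_KEY = "zlib.compress.level";
    static final String STRATEGY_KEY = "zlib.compress.strategy";

    // Construction has no access to configuration, so defaults are used.
    private final Deflater deflater = new Deflater();
    private int level = Deflater.DEFAULT_COMPRESSION;
    private int strategy = Deflater.DEFAULT_STRATEGY;

    // The only entry point that receives configuration.
    void reinit(Map<String, String> conf) {
        deflater.reset();
        level = Integer.parseInt(
            conf.getOrDefault(LEVEL_KEY, String.valueOf(Deflater.DEFAULT_COMPRESSION)));
        strategy = Integer.parseInt(
            conf.getOrDefault(STRATEGY_KEY, String.valueOf(Deflater.DEFAULT_STRATEGY)));
        deflater.setLevel(level);
        // The JDK deflater takes a strategy too, which is the point of 1b.
        deflater.setStrategy(strategy);
    }

    int getLevel() { return level; }

    int getStrategy() { return strategy; }

    // Releases the native resources held by the deflater.
    void end() { deflater.end(); }
}
```

In this shape, a caller who wants a non-default level has to construct the object and then immediately call reinit() on it, which is the oddity described in point 2.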

Questions to consider:
 1 - Should Compressors have a constructor that takes a Configuration object in the base class?
 2 - If not, should reinit() take a Configuration object?
 3 - If not, where should reinit's functionality be?
 4 - Are there other items of configuration besides level that are generic?
 5 - If not, should there be a constructor for Compressor that takes compression level as an input with a default value?
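For discussion, one possible shape that combines questions 1, 2 and 5 might look like the sketch below. Everything in it is hypothetical: the class names, the generic "compress.level" key, and the use of a plain Map in place of Configuration are assumptions for illustration, not a proposal for the actual API.

```java
import java.util.Map;
import java.util.zip.Deflater;

// Hypothetical base class; the names and the generic key are illustrative only.
abstract class SketchCompressorBase {
    static final String LEVEL_KEY = "compress.level"; // assumed generic key
    static final int DEFAULT_LEVEL = Deflater.DEFAULT_COMPRESSION;

    private int level;

    // Question 5: a constructor that takes the level directly.
    protected SketchCompressorBase(int level) { this.level = level; }

    // Question 1: a constructor that takes the configuration.
    protected SketchCompressorBase(Map<String, String> conf) { this(readLevel(conf)); }

    static int readLevel(Map<String, String> conf) {
        return Integer.parseInt(conf.getOrDefault(LEVEL_KEY, String.valueOf(DEFAULT_LEVEL)));
    }

    // Question 2: reinit() reads the same key that the constructor reads.
    final void reinit(Map<String, String> conf) {
        level = readLevel(conf);
        applyLevel(level);
    }

    final int getLevel() { return level; }

    // Subclasses push the new level into their underlying codec.
    protected abstract void applyLevel(int level);
}

final class SketchConfiguredDeflater extends SketchCompressorBase {
    private final Deflater deflater;

    SketchConfiguredDeflater() { this(DEFAULT_LEVEL); }

    SketchConfiguredDeflater(int level) {
        super(level);
        deflater = new Deflater(level);
    }

    SketchConfiguredDeflater(Map<String, String> conf) {
        super(conf);
        deflater = new Deflater(getLevel());
    }

    @Override
    protected void applyLevel(int level) {
        deflater.reset();
        deflater.setLevel(level);
    }

    // Releases the native resources held by the deflater.
    void end() { deflater.end(); }
}
```

With this arrangement init and reinit go through the same readLevel() path, so the two cannot drift apart. Codec-specific settings such as strategy would still need a home in the subclass.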
