Posted to common-dev@hadoop.apache.org by "Eric Payne (Jira)" <ji...@apache.org> on 2020/11/17 21:11:00 UTC

[jira] [Created] (HADOOP-17381) Ability to specify separate compression settings when intermediate and final output use the same codec

Eric Payne created HADOOP-17381:
-----------------------------------

             Summary: Ability to specify separate compression settings when intermediate and final output use the same codec
                 Key: HADOOP-17381
                 URL: https://issues.apache.org/jira/browse/HADOOP-17381
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Eric Payne


The ZStandard codec may become one that users want to use for both intermediate data and final output data, with a different compression level for each use case.

It would be nice if we could create a "meta codec", such as an IntermediateCodec, that uses a conf prefix technique (as Oozie does with the oozie.launcher prefix for the Oozie launcher configs) to create a custom config namespace of sorts. That namespace would hold arbitrary codec settings specific to the intermediate codec, separate from those of the final output codec, even when the same underlying codec is used for both.
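
As a rough illustration of the prefix idea, here is a minimal sketch in plain Java. It uses a java.util.Map in place of a Hadoop Configuration so that it stands alone. The "intermediate.codec." prefix and the PrefixedConfSketch class are hypothetical names chosen for this example, not existing Hadoop properties or classes; io.compression.codec.zstd.level is the existing zstd level setting.

```java
import java.util.HashMap;
import java.util.Map;

public class PrefixedConfSketch {
    // Hypothetical prefix for intermediate-only settings (an assumption for this sketch).
    static final String INTERMEDIATE_PREFIX = "intermediate.codec.";

    // Builds the settings view an intermediate codec would see: every key that
    // starts with the prefix is stripped of it and overrides the unprefixed key.
    static Map<String, String> overlay(Map<String, String> base, String prefix) {
        Map<String, String> view = new HashMap<>();
        Map<String, String> overrides = new HashMap<>();
        for (Map.Entry<String, String> e : base.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                overrides.put(e.getKey().substring(prefix.length()), e.getValue());
            } else {
                view.put(e.getKey(), e.getValue());
            }
        }
        // Apply overrides last so they always win over the shared settings.
        view.putAll(overrides);
        return view;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Level used for final output.
        conf.put("io.compression.codec.zstd.level", "3");
        // Faster level for intermediate data, set under the hypothetical prefix.
        conf.put(INTERMEDIATE_PREFIX + "io.compression.codec.zstd.level", "1");

        Map<String, String> intermediate = overlay(conf, INTERMEDIATE_PREFIX);
        System.out.println("final level: " + conf.get("io.compression.codec.zstd.level"));
        System.out.println("intermediate level: " + intermediate.get("io.compression.codec.zstd.level"));
    }
}
```

The final output codec would keep reading the unprefixed key, while a meta codec would read from the overlaid view. The sketch only covers deriving the settings; it does not address how the derived settings would reach the underlying codec.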

However, codecs don't allow a configuration to be passed when obtaining a codec stream, and I think we would have to bypass the CodecPool entirely to be able to pass a custom conf to an arbitrary codec.

Another approach is to skip trying to generalize the solution and specifically focus on ZStandard. It would be easy to create a wrapper codec around the existing ZStandardCompressor and ZStandardDecompressor which take the relevant parameters directly in their constructors.




