Posted to common-user@hadoop.apache.org by Ted Yu <yu...@gmail.com> on 2010/01/18 21:53:41 UTC

multiple compression codecs

Hi,
mapred.output.compress is set to true in hadoop-site.xml.
My question is: how can I specify different compression codecs
programmatically?

For example, normally the output is gzip compressed. But a small portion of
output needs to be LZO compressed.

Thanks

Re: multiple compression codecs

Posted by Todd Lipcon <to...@cloudera.com>.
On Mon, Jan 18, 2010 at 1:08 PM, Alex Kozlov <al...@cloudera.com> wrote:

> You can specify the compression/codec in the file writer (is this what you
> are asking?).
>
>
> SequenceFile.createWriter(fs, conf, path, key.getClass(), value.getClass(),
> SequenceFile.CompressionType.BLOCK, codec);
>
> You can also create your own FileOutputFormat.
>
>
If you're using one of the built-in FileOutputFormat subclasses, you can
use FileOutputFormat.setOutputCompressorClass to set the codec for the job.
If you're using MultipleOutputs, you will probably have to override some
methods in order to create different RecordWriters with different codecs, as
Alex suggested above.
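
For the per-job case, a minimal sketch might look like the following. It
assumes the old org.apache.hadoop.mapred API; the class name CodecPerJob and
the helper useCodec are made up for illustration. GzipCodec ships with Hadoop,
while an LZO codec comes from a separate package, so its class name would have
to be substituted by whoever has that library installed.

```java
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;

// Hypothetical helper class, not part of Hadoop.
public class CodecPerJob {

    /** Turns on output compression for this job and selects the codec class. */
    public static void useCodec(JobConf job,
                                Class<? extends CompressionCodec> codecClass) {
        // Per-job counterpart of setting mapred.output.compress in the site file.
        FileOutputFormat.setCompressOutput(job, true);
        // Overrides whatever codec the site configuration would have picked.
        FileOutputFormat.setOutputCompressorClass(job, codecClass);
    }

    public static void main(String[] args) {
        JobConf job = new JobConf();
        useCodec(job, GzipCodec.class);
        // Read the setting back; DefaultCodec is only the fallback if none is set.
        System.out.println(
            FileOutputFormat.getOutputCompressorClass(job, DefaultCodec.class).getName());
    }
}
```

This sets one codec for the whole job, so it covers the "normally gzip" part
of the question but not the small portion that needs a different codec.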

-Todd





Re: multiple compression codecs

Posted by Alex Kozlov <al...@cloudera.com>.
You can specify the compression/codec in the file writer (is this what you
are asking?).


SequenceFile.createWriter(fs, conf, path, key.getClass(), value.getClass(),
SequenceFile.CompressionType.BLOCK, codec);
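
To put the call above in context, here is a small self-contained sketch that
writes one record to a block-compressed SequenceFile with a codec chosen by
the caller. The class name CodecPerFile, the write helper, the sample record
and the default file name are all assumptions for illustration. DefaultCodec
is used in main because it needs no extra libraries; some Hadoop versions
reject GzipCodec for SequenceFiles when the native library is not loaded, and
an LZO codec has to be installed separately.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.util.ReflectionUtils;

// Hypothetical example class, not part of Hadoop.
public class CodecPerFile {

    /** Writes a single sample record to path, compressed with the given codec. */
    public static void write(Configuration conf, Path path,
                             Class<? extends CompressionCodec> codecClass)
            throws IOException {
        FileSystem fs = path.getFileSystem(conf);
        // Instantiate through ReflectionUtils so the codec receives the configuration.
        CompressionCodec codec = ReflectionUtils.newInstance(codecClass, conf);
        SequenceFile.Writer writer = SequenceFile.createWriter(
                fs, conf, path, Text.class, IntWritable.class,
                SequenceFile.CompressionType.BLOCK, codec);
        try {
            writer.append(new Text("example"), new IntWritable(1));
        } finally {
            writer.close();
        }
    }

    public static void main(String[] args) throws IOException {
        String target = args.length > 0 ? args[0] : "example.seq";
        write(new Configuration(), new Path(target), DefaultCodec.class);
    }
}
```

Because the codec is an argument to the writer, two files written by the same
task can use different codecs regardless of what the job configuration says.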

You can also create your own FileOutputFormat.
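
One possible shape for such an output format, as a sketch only: subclass
TextOutputFormat from the old mapred API and hand the parent a copy of the job
configuration with a different codec for selected output names. The class
name, the "alt-" name prefix and the example.alt.output.codec property are all
hypothetical, and DefaultCodec merely stands in for whatever alternative codec
(for instance an LZO codec from a separately installed library) is wanted.
Whether output names actually reach getRecordWriter with a usable prefix
depends on how the job produces its extra outputs, which is also an assumption
here.

```java
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordWriter;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.util.Progressable;

// Hypothetical output format, not part of Hadoop.
public class PerNameCodecTextOutputFormat<K, V> extends TextOutputFormat<K, V> {

    // Both names below are invented for this sketch.
    public static final String ALT_CODEC_KEY = "example.alt.output.codec";
    public static final String ALT_PREFIX = "alt-";

    /** Returns the job conf to use for an output name, leaving the original untouched. */
    public static JobConf confFor(JobConf job, String name) {
        if (!name.startsWith(ALT_PREFIX)) {
            return job; // normal outputs keep the job-wide codec
        }
        JobConf copy = new JobConf(job);
        Class<? extends CompressionCodec> alt =
                copy.getClass(ALT_CODEC_KEY, DefaultCodec.class, CompressionCodec.class);
        FileOutputFormat.setCompressOutput(copy, true);
        FileOutputFormat.setOutputCompressorClass(copy, alt);
        return copy;
    }

    @Override
    public RecordWriter<K, V> getRecordWriter(FileSystem ignored, JobConf job,
                                              String name, Progressable progress)
            throws IOException {
        // The parent reads the codec from the conf it is given, so pass the adjusted copy.
        return super.getRecordWriter(ignored, confFor(job, name), name, progress);
    }
}
```

Copying the JobConf keeps the change local to one writer, so outputs whose
names lack the prefix still get the job-wide codec.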
