Posted to user@flink.apache.org by Tarandeep Singh <ta...@gmail.com> on 2016/04/18 12:56:11 UTC

Compression - AvroOutputFormat and over network ?

Hi,

How can I set compression for AvroOutputFormat when writing files to HDFS?
Also, can we set compression for intermediate data that is sent over the
network (from the map to the reduce phase)?

Thanks,
Tarandeep

Re: Compression - AvroOutputFormat and over network ?

Posted by Tarandeep Singh <ta...@gmail.com>.
The Avro changes look easy; I think I can make them myself.
For the network data changes, I need some direction.

@Ufuk, please point me to the corresponding code.

thanks,
Tarandeep

On Mon, Apr 18, 2016 at 11:05 AM, Robert Metzger <rm...@apache.org>
wrote:

> Hi Tarandeep,
>
> I think for that you would need to set a codec factory on the
> DataFileWriter. Sadly we don't expose that method to the user.
>
> If you want, you can contribute this change to Flink. Otherwise, I can
> quickly fix it.
>
> Regards,
> Robert

Re: Compression - AvroOutputFormat and over network ?

Posted by Robert Metzger <rm...@apache.org>.
Hi Tarandeep,

I think for that you would need to set a codec factory on the DataFileWriter.
Sadly we don't expose that method to the user.

If you want, you can contribute this change to Flink. Otherwise, I can
quickly fix it.

Regards,
Robert
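
For readers following along, the codec-factory idea above can be sketched against the plain Avro API, outside Flink. This is only an illustration and not Flink's AvroOutputFormat code: the class name AvroCodecSketch, the Event schema, and both helper methods are made up for the example, and it assumes the Avro library is on the classpath. Deflate is used because it needs no extra dependency.

```java
import org.apache.avro.Schema;
import org.apache.avro.file.CodecFactory;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: names and schema are assumptions, not Flink code.
public class AvroCodecSketch {

    // Made-up single-field schema, used only for this example.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Event\","
        + "\"fields\":[{\"name\":\"id\",\"type\":\"long\"}]}");

    // Writes `count` records into an in-memory Avro container file
    // whose data blocks are deflate-compressed.
    public static byte[] writeCompressed(int count) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataFileWriter<GenericRecord> writer =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(SCHEMA))) {
            // The codec has to be set before create() opens the file.
            writer.setCodec(CodecFactory.deflateCodec(6));
            writer.create(SCHEMA, out);
            for (long i = 0; i < count; i++) {
                GenericRecord record = new GenericData.Record(SCHEMA);
                record.put("id", i);
                writer.append(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reads the codec name stored in the container file's metadata.
    public static String codecOf(byte[] avroFile) {
        try (DataFileStream<GenericRecord> in = new DataFileStream<>(
                 new ByteArrayInputStream(avroFile),
                 new GenericDatumReader<GenericRecord>())) {
            return in.getMetaString("avro.codec");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Exposing this in Flink would presumably mean letting the user pass the codec choice through to the point where the output format creates its DataFileWriter; how that setter is named and wired is left open here.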


On Mon, Apr 18, 2016 at 2:36 PM, Ufuk Celebi <uc...@apache.org> wrote:

> Hey Tarandeep,
>
> regarding the network part: not possible at the moment. It's pretty
> straightforward to add support for it, but no one ever got around to
> actually implementing it. If you would like to contribute, I am happy
> to give some hints about which parts of the system would need to be
> modified.
>
> – Ufuk

Re: Compression - AvroOutputFormat and over network ?

Posted by Ufuk Celebi <uc...@apache.org>.
Hey Tarandeep,

regarding the network part: not possible at the moment. It's pretty
straightforward to add support for it, but no one ever got around to
actually implementing it. If you would like to contribute, I am happy
to give some hints about which parts of the system would need to be
modified.

– Ufuk
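
To make concrete what compressing intermediate data would involve, here is a generic sketch of compressing a serialized byte buffer before it is sent and restoring it on the receiving side. It uses only java.util.zip and has nothing to do with Flink's actual network stack: the class BufferCompressionSketch and its two methods are hypothetical, and where such a step would hook into Flink is exactly the open question in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: not Flink code, just a buffer round trip.
public class BufferCompressionSketch {

    // Sender side: deflate a serialized buffer, favoring speed over ratio.
    public static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end(); // release native zlib resources
        return out.toByteArray();
    }

    // Receiver side: inflate the buffer back to its original bytes.
    public static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                // No progress and no way to continue means bad input.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed buffer", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

The trade-off is CPU time on both ends against fewer bytes on the wire, so whether it pays off depends on how compressible the records are and how constrained the network is.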

