Posted to user@flume.apache.org by Kristoffer Sjögren <st...@gmail.com> on 2014/08/23 10:51:25 UTC

Parquet buffering capability

Hi

Does Flume support buffering/staging Avro events locally on disk and then
storing them in HDFS as Parquet files?

The Cloudera CDK examples [1] show how to do this manually, but ideally I
want this process integrated directly into the Flume runtime.

Cheers,
-Kristoffer

1. https://github.com/cloudera/cdk-examples/tree/master/dataset-staging

Re: Parquet buffering capability

Posted by Hari Shreedharan <hs...@cloudera.com>.
Missed the link:
https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLUME-2439

On Saturday, August 23, 2014, Hari Shreedharan <hs...@cloudera.com>
wrote:

> This was added recently and has not yet made it into a release:

Re: Parquet buffering capability

Posted by Hari Shreedharan <hs...@cloudera.com>.
This was added recently and has not yet made it into a release:
