Posted to dev@flume.apache.org by Israel Ekpo <is...@aicer.org> on 2014/05/02 23:18:23 UTC

Flume Jambalaya - A Flume Plugin with Multiple Components

Flume Community,

I created a Flume Plugin with multiple components that complements the
current version of Apache Flume.

This was necessary as part of a personal project that I am working on.

It is code-named Flume Jambalaya.

Jambalaya is a standalone Apache Flume plugin that contains a variety of
sources, interceptors, channels, sinks, serializers and other components
designed to extend the Flume architecture. It has been released under the
Apache License, version 2.0.

https://github.com/aicer/flume-jambalaya

It currently contains:

(a) File Source - This source lets you ingest data by tailing files from a
specific path
(b) ElasticSearch HTTP Sink - This sink sends events to an ElasticSearch
cluster via HTTP, so the ElasticSearch version used on the Flume side does
not have to match the version running on the server cluster.
(c) DateInterceptor - The date interceptor is used for parsing dates from
fields and using that date or timestamp as the timestamp for the Flume
event.
(d) Grok Interceptor - This interceptor lets you extract structured data
from unstructured text and inject it into the event as headers.
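
To make (c) and (d) more concrete, here is a minimal Python sketch of the
idea. It is not code from the plugin: the line pattern, the function names,
the "date" field and the UTC assumption are all hypothetical. The only part
taken from Flume itself is the convention of storing the event time in a
"timestamp" header as epoch milliseconds.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern for lines like "2014-05-02 23:18:23 ERROR disk full".
LINE_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<message>.*)"
)


def extract_headers(body, headers):
    """Grok-style step: copy named regex groups into the event headers."""
    match = LINE_PATTERN.match(body)
    if match:
        headers.update(match.groupdict())
    return headers


def apply_date(headers, field="date", fmt="%Y-%m-%d %H:%M:%S"):
    """Date step: parse a header field and store it as epoch milliseconds."""
    if field in headers:
        # Assumption: the log line carries no zone, so it is treated as UTC.
        parsed = datetime.strptime(headers[field], fmt)
        parsed = parsed.replace(tzinfo=timezone.utc)
        headers["timestamp"] = str(int(parsed.timestamp() * 1000))
    return headers


event_headers = apply_date(
    extract_headers("2014-05-02 23:18:23 ERROR disk full", {})
)
```

After this runs, event_headers holds the extracted "date", "level" and
"message" fields plus a "timestamp" header derived from the parsed date.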

Sample configuration files are available here:

https://github.com/aicer/flume-jambalaya/tree/master/sample-configuration-files

I did not realize that the Flume trunk already has an HTTP Sink for
ElasticSearch, so you can decide whether or not to use the sink that comes
with it.

I am still testing and integrating the various components.

Please check it out when you get a chance and send me some feedback.

Thanks.

Re: Flume Jambalaya - A Flume Plugin with Multiple Components

Posted by Wolfgang Hoschek <wh...@cloudera.com>.

My sense is that (a) is interesting if it evolves into a capable, true native tailer, whereas (b) is already available in Flume, and (c) and (d) are already available in Flume via the MorphlineInterceptor.

Wolfgang.

On May 3, 2014, at 12:18 AM, Israel Ekpo <is...@aicer.org> wrote:

> Flume Community,
> 
> I created a Flume Plugin with multiple components that complements the current version of Apache Flume.
> 
> This was necessary as part of a personal project as I working on.
> 
> It is code named - Flume Jambalaya
> 
> Jambalaya is a standalone Apache Flume plugin that contains a variety of sources, interceptors, channels, sinks, serializers and other components designed to extend the Flume architecture. It has been released under the Apache License version 2.0
> 
> https://github.com/aicer/flume-jambalaya
> 
> It currently contains:
> 
> (a) File Source - This source lets you ingest data by tailing files from a specific path
> (b) ElasticSearch HTTP Sink - This sink sends events to an ElasticSearch cluster via HTTP with no dependency on the ElasticSearch versions between Flume and the Server cluster.
> (c) DateInterceptor - The date interceptor is used for parsing dates from fields and using that date or timestamp as the timestamp for the Flume event.
> (d) Grok Interceptor - allows you to extract structured data from unstructured text and inject them as headers into the event
> 
> Sample configuration files are available here
> 
> https://github.com/aicer/flume-jambalaya/tree/master/sample-configuration-files
> 
> I did not realize that the Flume trunk already has a HTTP Sink for ElasticSearch so you can decide whether or not to use the sink that comes with it
> 
> I am still testing and integrating the various components.
> 
> Please check it out when you get a chance and send me some feedback
> 
> Thanks.
>

Re: Flume Jambalaya - A Flume Plugin with Multiple Components

Posted by Otis Gospodnetic <ot...@gmail.com>.

Had a quick look.  Added my observations about FileSource stuff in
https://issues.apache.org/jira/browse/FLUME-2344

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Fri, May 2, 2014 at 8:21 PM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:

> What Wolfgang said :)
>
> I'd be interested in hearing how the File Source is different from or
> better than Exec Source with tail -F or
> https://issues.apache.org/jira/browse/FLUME-2344 - do you know?
>
> Thanks,
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Fri, May 2, 2014 at 5:18 PM, Israel Ekpo <is...@aicer.org> wrote:
>
>> Flume Community,
>>
>> I created a Flume Plugin with multiple components that complements the
>> current version of Apache Flume.
>>
>> This was necessary as part of a personal project as I working on.
>>
>> It is code named - Flume Jambalaya
>>
>> Jambalaya is a standalone Apache Flume plugin that contains a variety of
>> sources, interceptors, channels, sinks, serializers and other components
>> designed to extend the Flume architecture. It has been released under the
>> Apache License version 2.0
>>
>> https://github.com/aicer/flume-jambalaya
>>
>> It currently contains:
>>
>> (a) File Source - This source lets you ingest data by tailing files from
>> a specific path
>> (b) ElasticSearch HTTP Sink - This sink sends events to an ElasticSearch
>> cluster via HTTP with no dependency on the ElasticSearch versions between
>> Flume and the Server cluster.
>> (c) DateInterceptor - The date interceptor is used for parsing dates from
>> fields and using that date or timestamp as the timestamp for the Flume
>> event.
>> (d) Grok Interceptor - allows you to extract structured data from
>> unstructured text and inject them as headers into the event
>>
>> Sample configuration files are available here
>>
>>
>> https://github.com/aicer/flume-jambalaya/tree/master/sample-configuration-files
>>
>> I did not realize that the Flume trunk already has a HTTP Sink for
>> ElasticSearch so you can decide whether or not to use the sink that comes
>> with it
>>
>> I am still testing and integrating the various components.
>>
>> Please check it out when you get a chance and send me some feedback
>>
>> Thanks.
>>
>
>

Re: Flume Jambalaya - A Flume Plugin with Multiple Components

Posted by Otis Gospodnetic <ot...@gmail.com>.

What Wolfgang said :)

I'd be interested in hearing how the File Source is different from or
better than Exec Source with tail -F or
https://issues.apache.org/jira/browse/FLUME-2344 - do you know?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Fri, May 2, 2014 at 5:18 PM, Israel Ekpo <is...@aicer.org> wrote:

> Flume Community,
>
> I created a Flume Plugin with multiple components that complements the
> current version of Apache Flume.
>
> This was necessary as part of a personal project as I working on.
>
> It is code named - Flume Jambalaya
>
> Jambalaya is a standalone Apache Flume plugin that contains a variety of
> sources, interceptors, channels, sinks, serializers and other components
> designed to extend the Flume architecture. It has been released under the
> Apache License version 2.0
>
> https://github.com/aicer/flume-jambalaya
>
> It currently contains:
>
> (a) File Source - This source lets you ingest data by tailing files from a
> specific path
> (b) ElasticSearch HTTP Sink - This sink sends events to an ElasticSearch
> cluster via HTTP with no dependency on the ElasticSearch versions between
> Flume and the Server cluster.
> (c) DateInterceptor - The date interceptor is used for parsing dates from
> fields and using that date or timestamp as the timestamp for the Flume
> event.
> (d) Grok Interceptor - allows you to extract structured data from
> unstructured text and inject them as headers into the event
>
> Sample configuration files are available here
>
>
> https://github.com/aicer/flume-jambalaya/tree/master/sample-configuration-files
>
> I did not realize that the Flume trunk already has a HTTP Sink for
> ElasticSearch so you can decide whether or not to use the sink that comes
> with it
>
> I am still testing and integrating the various components.
>
> Please check it out when you get a chance and send me some feedback
>
> Thanks.
>
