Posted to dev@flume.apache.org by Israel Ekpo <is...@aicer.org> on 2014/05/02 23:18:23 UTC
Flume Jambalaya - A Flume Plugin with Multiple Components
Flume Community,
I created a Flume Plugin with multiple components that complements the
current version of Apache Flume.
This was necessary as part of a personal project that I am working on.
It is code-named Flume Jambalaya.
Jambalaya is a standalone Apache Flume plugin that contains a variety of
sources, interceptors, channels, sinks, serializers and other components
designed to extend the Flume architecture. It has been released under the
Apache License, Version 2.0.
https://github.com/aicer/flume-jambalaya
It currently contains:
(a) File Source - This source lets you ingest data by tailing files from a
specific path
(b) ElasticSearch HTTP Sink - This sink sends events to an ElasticSearch
cluster via HTTP, so the ElasticSearch version bundled with Flume does not
have to match the version running on the server cluster.
(c) DateInterceptor - The date interceptor is used for parsing dates from
fields and using that date or timestamp as the timestamp for the Flume
event.
(d) Grok Interceptor - This interceptor lets you extract structured data
from unstructured text and inject it as headers into the event.
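To make the idea behind (c) and (d) concrete, here is a minimal sketch in Python. It is not the Jambalaya code: the log format, the pattern, the field names and the helper functions are all hypothetical, and a named-group regular expression stands in for a real Grok pattern. It only illustrates the two steps: pull fields out of the event body into headers, then parse one of them into the epoch-millisecond "timestamp" header that Flume uses.

```python
import calendar
import re
from datetime import datetime

# Hypothetical pattern: a named-group regex standing in for a Grok pattern.
LINE_PATTERN = re.compile(
    r"(?P<client>\S+) \[(?P<date>[^\]]+)\] (?P<message>.*)"
)


def extract_headers(body, headers):
    """Grok-style step: copy the named regex groups into the event headers."""
    match = LINE_PATTERN.match(body)
    if match:
        headers.update(match.groupdict())
    return headers


def apply_date(headers, field="date", fmt="%Y-%m-%d %H:%M:%S"):
    """Date step: parse a header and store epoch milliseconds as 'timestamp'."""
    raw = headers.get(field)
    if raw is None:
        return headers
    parsed = datetime.strptime(raw, fmt)
    # Assumption: the date in the log line is in UTC.
    headers["timestamp"] = str(calendar.timegm(parsed.timetuple()) * 1000)
    return headers


event_headers = apply_date(
    extract_headers("10.0.0.1 [2014-05-02 23:18:23] login ok", {})
)
```

After this runs, event_headers holds the client, date and message fields plus a timestamp header derived from the date field.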
Sample configuration files are available here:
https://github.com/aicer/flume-jambalaya/tree/master/sample-configuration-files
I did not realize that the Flume trunk already has an HTTP Sink for
ElasticSearch, so you can decide whether or not to use the sink that comes
with Jambalaya.
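For readers unfamiliar with what sending events over HTTP involves, the sketch below builds the newline-delimited body that ElasticSearch's _bulk endpoint accepts: one action line followed by one document line per event. This is only an illustration in Python, not the code of either sink; the function name, the index name and the document type are assumptions. The "_type" field reflects ElasticSearch versions from around the time of this thread and is left out when no type is given.

```python
import json


def build_bulk_body(events, index_name, doc_type=None):
    """Serialize events into the newline-delimited body of a _bulk request."""
    lines = []
    for event in events:
        meta = {"_index": index_name}
        if doc_type is not None:
            # Older ElasticSearch versions expect a document type here.
            meta["_type"] = doc_type
        lines.append(json.dumps({"index": meta}))  # action line
        lines.append(json.dumps(event))            # document line
    if not lines:
        return ""
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"
```

The returned string would be sent as the body of an HTTP POST to the cluster's _bulk endpoint, which is why no ElasticSearch client library is needed on the Flume side.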
I am still testing and integrating the various components.
Please check it out when you get a chance and send me some feedback.
Thanks.
Re: Flume Jambalaya - A Flume Plugin with Multiple Components
Posted by Wolfgang Hoschek <wh...@cloudera.com>.
My sense is that a) is interesting if it evolves into a capable, true native tailer, whereas b) is already available in Flume, and c) and d) are already available in Flume via the MorphlineInterceptor.
Wolfgang.
On May 3, 2014, at 12:18 AM, Israel Ekpo <is...@aicer.org> wrote:
> Flume Community,
>
> I created a Flume Plugin with multiple components that complements the current version of Apache Flume.
>
> This was necessary as part of a personal project that I am working on.
>
> It is code named - Flume Jambalaya
>
> Jambalaya is a standalone Apache Flume plugin that contains a variety of sources, interceptors, channels, sinks, serializers and other components designed to extend the Flume architecture. It has been released under the Apache License version 2.0
>
> https://github.com/aicer/flume-jambalaya
>
> It currently contains:
>
> (a) File Source - This source lets you ingest data by tailing files from a specific path
> (b) ElasticSearch HTTP Sink - This sink sends events to an ElasticSearch cluster via HTTP with no dependency on the ElasticSearch versions between Flume and the Server cluster.
> (c) DateInterceptor - The date interceptor is used for parsing dates from fields and using that date or timestamp as the timestamp for the Flume event.
> (d) Grok Interceptor - allows you to extract structured data from unstructured text and inject them as headers into the event
>
> Sample configuration files are available here
>
> https://github.com/aicer/flume-jambalaya/tree/master/sample-configuration-files
>
> I did not realize that the Flume trunk already has a HTTP Sink for ElasticSearch so you can decide whether or not to use the sink that comes with it
>
> I am still testing and integrating the various components.
>
> Please check it out when you get a chance and send me some feedback
>
> Thanks.
>
Re: Flume Jambalaya - A Flume Plugin with Multiple Components
Posted by Otis Gospodnetic <ot...@gmail.com>.
Had a quick look. Added my observations about FileSource stuff in
https://issues.apache.org/jira/browse/FLUME-2344
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
On Fri, May 2, 2014 at 8:21 PM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:
> What Wolfgang said :)
>
> I'd be interested in hearing how the File Source is different from or
> better than Exec Source with tail -F or
> https://issues.apache.org/jira/browse/FLUME-2344 - do you know?
>
> Thanks,
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Fri, May 2, 2014 at 5:18 PM, Israel Ekpo <is...@aicer.org> wrote:
>
>> Flume Community,
>>
>> I created a Flume Plugin with multiple components that complements the
>> current version of Apache Flume.
>>
>> This was necessary as part of a personal project that I am working on.
>>
>> It is code named - Flume Jambalaya
>>
>> Jambalaya is a standalone Apache Flume plugin that contains a variety of
>> sources, interceptors, channels, sinks, serializers and other components
>> designed to extend the Flume architecture. It has been released under the
>> Apache License version 2.0
>>
>> https://github.com/aicer/flume-jambalaya
>>
>> It currently contains:
>>
>> (a) File Source - This source lets you ingest data by tailing files from
>> a specific path
>> (b) ElasticSearch HTTP Sink - This sink sends events to an ElasticSearch
>> cluster via HTTP with no dependency on the ElasticSearch versions between
>> Flume and the Server cluster.
>> (c) DateInterceptor - The date interceptor is used for parsing dates from
>> fields and using that date or timestamp as the timestamp for the Flume
>> event.
>> (d) Grok Interceptor - allows you to extract structured data from
>> unstructured text and inject them as headers into the event
>>
>> Sample configuration files are available here
>>
>>
>> https://github.com/aicer/flume-jambalaya/tree/master/sample-configuration-files
>>
>> I did not realize that the Flume trunk already has a HTTP Sink for
>> ElasticSearch so you can decide whether or not to use the sink that comes
>> with it
>>
>> I am still testing and integrating the various components.
>>
>> Please check it out when you get a chance and send me some feedback
>>
>> Thanks.
>>
>
>
Re: Flume Jambalaya - A Flume Plugin with Multiple Components
Posted by Otis Gospodnetic <ot...@gmail.com>.
What Wolfgang said :)
I'd be interested in hearing how the File Source is different from or
better than Exec Source with tail -F or
https://issues.apache.org/jira/browse/FLUME-2344 - do you know?
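One difference that usually comes up in this comparison is position tracking: a process running tail -F does not remember where it stopped if it is restarted, while a tailer built into the source can store a byte offset and resume from it. The Python sketch below illustrates that idea only. It is not the Jambalaya File Source or the FLUME-2344 patch; the function name and the simple truncation check are assumptions.

```python
import os


def read_new_lines(path, offset):
    """Return (complete lines appended since offset, new offset)."""
    # If the file shrank, assume it was truncated or rotated and start over.
    if os.path.getsize(path) < offset:
        offset = 0
    with open(path, "rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    # Consume only up to the last newline so that a half-written line
    # is left in place for the next poll.
    end = chunk.rfind(b"\n") + 1
    raw_lines = chunk[:end].split(b"\n")[:-1]
    lines = [raw.decode("utf-8", errors="replace") for raw in raw_lines]
    return lines, offset + end
```

A caller would poll this function in a loop and persist the returned offset somewhere durable, so that a restart continues from the last delivered line instead of the end of the file.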
Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
On Fri, May 2, 2014 at 5:18 PM, Israel Ekpo <is...@aicer.org> wrote:
> Flume Community,
>
> I created a Flume Plugin with multiple components that complements the
> current version of Apache Flume.
>
> This was necessary as part of a personal project that I am working on.
>
> It is code named - Flume Jambalaya
>
> Jambalaya is a standalone Apache Flume plugin that contains a variety of
> sources, interceptors, channels, sinks, serializers and other components
> designed to extend the Flume architecture. It has been released under the
> Apache License version 2.0
>
> https://github.com/aicer/flume-jambalaya
>
> It currently contains:
>
> (a) File Source - This source lets you ingest data by tailing files from a
> specific path
> (b) ElasticSearch HTTP Sink - This sink sends events to an ElasticSearch
> cluster via HTTP with no dependency on the ElasticSearch versions between
> Flume and the Server cluster.
> (c) DateInterceptor - The date interceptor is used for parsing dates from
> fields and using that date or timestamp as the timestamp for the Flume
> event.
> (d) Grok Interceptor - allows you to extract structured data from
> unstructured text and inject them as headers into the event
>
> Sample configuration files are available here
>
>
> https://github.com/aicer/flume-jambalaya/tree/master/sample-configuration-files
>
> I did not realize that the Flume trunk already has a HTTP Sink for
> ElasticSearch so you can decide whether or not to use the sink that comes
> with it
>
> I am still testing and integrating the various components.
>
> Please check it out when you get a chance and send me some feedback
>
> Thanks.
>