Posted to users@kafka.apache.org by yazgoo <ya...@gmail.com> on 2014/11/25 18:58:25 UTC

logging agent based on fuse and kafka: first release

Hi,

First, I'd like to thank the kafka developers for writing kafka.

This is an announcement for the first release of a file system logging
agent based on kafka.

It is written for collecting logs from servers running all kinds of software,
as a generic way to collect logs without needing to know about each logger.

Home:
https://github.com/yazgoo/fuse_kafka

Here are some of its features:

   - sends all writes to given directories to kafka
   - passes through FS syscalls to underlying directory
   - captures the pid, gid, uid, user, group, command line doing the write
   - you can add metadata to identify where the message comes from
   (e.g. ip-address, ...)
   - you can configure kafka destination cluster either by giving a broker
   list or a zookeeper list
   - you can specify a bandwidth quota: fuse_kafka won't send data for a
   file that is written at more than a given size per second (useful for
   preventing floods caused by core dumps or log rotations in directories
   watched by fuse_kafka)
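
To illustrate the bandwidth quota idea, here is a minimal Python sketch. It
is not fuse_kafka's actual code: the WriteQuota class, its allow() method and
the one-second fixed window are all assumptions made for the example. It only
shows the principle of dropping writes for a file once that file exceeds a
byte budget per second.

```python
import time


class WriteQuota:
    """Hypothetical per-file byte budget over a one-second window."""

    def __init__(self, max_bytes_per_second, clock=time.monotonic):
        self.max_bytes_per_second = max_bytes_per_second
        self.clock = clock  # injectable so the behaviour can be tested
        self.windows = {}   # path -> (window_start, bytes_seen_in_window)

    def allow(self, path, size):
        """Return True if a write of `size` bytes to `path` may be sent."""
        now = self.clock()
        start, seen = self.windows.get(path, (now, 0))
        if now - start >= 1.0:
            # The previous window has expired: start a fresh one.
            start, seen = now, 0
        seen += size
        self.windows[path] = (start, seen)
        return seen <= self.max_bytes_per_second
```

With a quota of 100 bytes per second, a second 60-byte write to the same file
within the same second would be rejected, while writes to other files are
unaffected.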

It is based on:

   - FUSE (filesystem in userspace), to capture writes done under a given
   directory
   - kafka (messaging queue), as the event transport system
   - logstash: events are written to kafka in logstash format (except
   messages and commands which are stored in base64)
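
As a rough idea of what such an event could look like, here is a Python
sketch that builds a logstash-style JSON document with the message and the
command base64-encoded. The build_event function and every field name in it
("@timestamp", "@version", "@message", "@fields", ...) are assumptions for
illustration; check the README for the real layout.

```python
import base64
import json
from datetime import datetime, timezone


def build_event(path, data, command, pid, uid, gid, user, group, fields=None):
    """Build a logstash-style JSON event (hypothetical field names)."""

    def b64(raw):
        # base64 keeps arbitrary bytes safe inside a JSON string
        return base64.b64encode(raw).decode("ascii")

    event = {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "@version": "1",
        "path": path,
        "pid": pid,
        "uid": uid,
        "gid": gid,
        "user": user,
        "group": group,
        "command": b64(command),   # command line of the writing process
        "@message": b64(data),     # bytes that were written to the file
        "@fields": fields or {},   # user-supplied metadata, e.g. an address
    }
    return json.dumps(event)
```

A consumer would parse the JSON and base64-decode the "@message" and
"command" values to get the original bytes back.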

It is written in C and Python.

Packages are provided for various distros; see the installing section in
README.md.
FUSE adds overhead, so it should not be used on filesystems where high
throughput is necessary.
Here are benchmarks:
http://htmlpreview.github.io/?https://raw.githubusercontent.com/yazgoo/fuse_kafka/master/benchs/benchmarks.html

Contributions are welcome, of course!

Regards

Re: logging agent based on fuse and kafka: first release

Posted by Neha Narkhede <ne...@gmail.com>.
Great. Thanks for sharing. I added it to our ecosystem
<https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem> wiki.
