Posted to user@flume.apache.org by Gumnaam Sur <gu...@gmail.com> on 2012/07/20 20:29:46 UTC

Details about Syslog format

I have a flume-ng 1.2 agent running on a CentOS 6.x box.
The box's rsyslogd daemon has been set up to forward all incoming syslog
messages (received via the local Unix socket, or over TCP/UDP port 514 from
remote hosts) to the flume-ng agent's SyslogUDP source.

There are two sinks: a file_roll sink and an hdfs sink.

A typical entry in /var/log/messages is:
"Jul 17 21:23:24 lm-collector yum: Installed: nc-1.84-22.el6.x86_64"

But when this gets logged to the file or to HDFS by flume-ng, I only see
" yum: Installed: nc-1.84-22.el6.x86_64", i.e. the timestamp and the host
info are missing.

I tried changing the serializer from 'text' to 'avro_event', but this
results in the Java Avro event object being serialized (along with control
characters etc.), which is not what I want.

I could use the 'SyslogAvroEventSerializer' in the test directory of
flume-ng-client, but that too would serialize the Java object, which is
not what I want.

So I ended up writing my own serializer, which simply extracts the message
and the syslog headers (timestamp, host, facility, severity) from the event
and writes them out on a single line.
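
A minimal sketch of the formatting step such a serializer performs, kept free
of the Flume interfaces so it runs on its own. The class name, the output
layout, and the header keys ("timestamp" as epoch milliseconds, "host",
"Facility", "Severity") are assumptions for illustration; check the keys
against what the syslog source actually sets in your Flume version.

```java
import java.nio.charset.StandardCharsets;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TimeZone;

// Hypothetical helper: rebuilds one text line from event headers + body.
class SyslogLineFormatter {
    // Assumed header keys; verify against your syslog source.
    static final String TIMESTAMP = "timestamp"; // epoch milliseconds as a string
    static final String HOST = "host";
    static final String FACILITY = "Facility";
    static final String SEVERITY = "Severity";

    static String format(Map<String, String> headers, byte[] body, TimeZone tz) {
        StringBuilder sb = new StringBuilder();

        String ts = headers.get(TIMESTAMP);
        if (ts != null) {
            // SimpleDateFormat is not thread-safe; a real serializer would
            // keep one instance per serializer rather than one per event.
            SimpleDateFormat fmt = new SimpleDateFormat("MMM dd HH:mm:ss", Locale.ENGLISH);
            fmt.setTimeZone(tz);
            sb.append(fmt.format(new Date(Long.parseLong(ts)))).append(' ');
        }

        String host = headers.get(HOST);
        if (host != null) {
            sb.append(host).append(' ');
        }

        String facility = headers.get(FACILITY);
        if (facility != null) {
            sb.append("facility=").append(facility).append(' ');
        }

        String severity = headers.get(SEVERITY);
        if (severity != null) {
            sb.append("severity=").append(severity).append(' ');
        }

        // The body arrives with the syslog header already stripped,
        // often with a leading space, so trim it.
        sb.append(new String(body, StandardCharsets.UTF_8).trim());
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put(TIMESTAMP, "1342560204000");
        headers.put(HOST, "lm-collector");
        headers.put(FACILITY, "1");
        headers.put(SEVERITY, "6");
        byte[] body = " yum: Installed: nc-1.84-22.el6.x86_64".getBytes(StandardCharsets.UTF_8);
        System.out.println(format(headers, body, TimeZone.getTimeZone("UTC")));
    }
}
```

In an actual serializer this logic would sit in the method that writes each
event to the sink's output stream, followed by a newline. Missing headers are
skipped rather than treated as errors, so events from other sources still
produce a line.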

My question is: is there a better alternative to writing your own
serializer for getting the full syslog messages logged in the sinks?
Or are there any tips for writing custom serializers that are fast and
efficient and do not leak memory?

Secondly, do the incoming syslog messages get parsed correctly regardless
of whether they are BSD style (RFC 3164) or the newer style (RFC 5424)?
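
For reference, the two layouts differ right after the <PRI> field: the older
BSD layout (RFC 3164) continues with a "Mmm dd hh:mm:ss" timestamp, while the
newer layout (RFC 5424) continues with a version number and a space. A rough
sketch of telling them apart follows; the class and method names are
hypothetical, and this is not a description of how Flume's own parser works.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: guesses the syslog layout of a raw message.
class SyslogFormatSniffer {
    enum Style { RFC3164, RFC5424, UNKNOWN }

    // <PRI> is one to three digits in angle brackets at the start.
    private static final Pattern PRI = Pattern.compile("^<(\\d{1,3})>");
    // RFC 5424: VERSION (non-zero digit, up to two more digits) then a space.
    private static final Pattern VERSION = Pattern.compile("^[1-9]\\d{0,2} ");
    // RFC 3164: "Mmm dd hh:mm:ss " with a space-padded day of month.
    private static final Pattern BSD_TIME =
        Pattern.compile("^[A-Z][a-z]{2} [ \\d]\\d \\d{2}:\\d{2}:\\d{2} ");

    static Style detect(String raw) {
        Matcher m = PRI.matcher(raw);
        if (!m.find()) {
            return Style.UNKNOWN;
        }
        String rest = raw.substring(m.end());
        if (VERSION.matcher(rest).find()) {
            return Style.RFC5424;
        }
        if (BSD_TIME.matcher(rest).find()) {
            return Style.RFC3164;
        }
        return Style.UNKNOWN;
    }

    // In both layouts, PRI = facility * 8 + severity.
    static int facility(int pri) { return pri / 8; }
    static int severity(int pri) { return pri % 8; }

    public static void main(String[] args) {
        System.out.println(detect("<30>Jul 17 21:23:24 lm-collector yum: Installed: nc-1.84-22.el6.x86_64"));
        System.out.println(detect("<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - 'su root' failed"));
    }
}
```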

thanks