Posted to user@flume.apache.org by Tadas Makčinskas <Ta...@teo.lt> on 2014/06/05 12:00:56 UTC

HDFS not adding \n

I’m receiving messages through a SyslogUDP source and storing them on HDFS, but the stored events are not separated by \n as I would expect.
The result is written without \n, which makes it hard to process further with the standard tool set.


# flume config
tier1.sources.sourceDHCP_Raw.type     = syslogudp
tier1.sources.sourceDHCP_Raw.host     = 0.0.0.0
tier1.sources.sourceDHCP_Raw.port     = 5141

tier1.sources.sourceDHCP_Raw.channels = channelDHCP_Raw

tier1.channels.channelDHCP_Raw.type   = memory
tier1.channels.channelDHCP_Raw.capacity = 100

tier1.sinks.sinkDHCP_Raw.type = hdfs
tier1.sinks.sinkDHCP_Raw.hdfs.path = /flume/TV/DHCP_RAW
tier1.sinks.sinkDHCP_Raw.hdfs.rollInterval = 10000
tier1.sinks.sinkDHCP_Raw.serializer.appendNewline = true
tier1.sinks.sinkDHCP_Raw.channel = channelDHCP_Raw

What comes through the network:

# tcpdump -n udp -A | grep 'ZXAN'
.....B......y..<191>S,3,00029b221441,0.0.0.0,2014-6-5 12:47:14.716,2014-6-5 12:47:14,ZXAN pon 0/ 2/3/ 8/2:6,28734950,0x853DE730
.....B......y..<191>S,3,00029b221441,0.0.0.0,2014-6-5 12:46:27.451,2014-6-5 12:46:27,ZXAN pon 0/ 2/3/ 8/2:6,28734950,0x853DE730


RE: HDFS not adding \n

Posted by Tadas Makčinskas <Ta...@teo.lt>.
I tried that with serializer type, but that did not help.
What actually saved my day was adding:

tier1.sinks.sinkDHCP_Raw.hdfs.writeFormat = Text
tier1.sinks.sinkDHCP_Raw.hdfs.fileType = DataStream
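
For anyone who finds this later, here is a sketch of the whole sink section with those two lines merged into the config from my first mail. As far as I understand it, the HDFS sink defaults to hdfs.fileType = SequenceFile, which writes binary key/value records, so the text serializer and its appendNewline setting never get applied; switching to DataStream writes the event bodies as plain text, one per line. That explanation is my reading of the Flume User Guide rather than something I checked in the source. The serializer lines are spelled out only for clarity, since both values should already be the defaults.

```properties
# HDFS sink section with the fix applied; source and channel sections are unchanged
tier1.sinks.sinkDHCP_Raw.type = hdfs
tier1.sinks.sinkDHCP_Raw.channel = channelDHCP_Raw
tier1.sinks.sinkDHCP_Raw.hdfs.path = /flume/TV/DHCP_RAW
tier1.sinks.sinkDHCP_Raw.hdfs.rollInterval = 10000

# Write raw event bodies as text instead of the default SequenceFile container
tier1.sinks.sinkDHCP_Raw.hdfs.fileType = DataStream
tier1.sinks.sinkDHCP_Raw.hdfs.writeFormat = Text

# Text serializer with a trailing newline per event (assumed to be the defaults already)
tier1.sinks.sinkDHCP_Raw.serializer = text
tier1.sinks.sinkDHCP_Raw.serializer.appendNewline = true
```

With this in place, the files under /flume/TV/DHCP_RAW should contain one syslog message per line.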

T.

From: Jeff Lord [mailto:jlord@cloudera.com]
Sent: 2014.06.05 19:51
To: user@flume.apache.org
Subject: Re: HDFS not adding \n

Can you try adding this line to your config?

tier1.sinks.sinkDHCP_Raw.serializer = text

Re: HDFS not adding \n

Posted by Jeff Lord <jl...@cloudera.com>.
Can you try adding this line to your config?

tier1.sinks.sinkDHCP_Raw.serializer = text