Posted to user@flume.apache.org by Roman Shaposhnik <rv...@apache.org> on 2012/12/01 01:40:21 UTC

Re: Flume and HDFS integration

On Fri, Nov 30, 2012 at 12:51 AM, Emile Kao <em...@gmx.net> wrote:
> Hello Brock,
> first of all thank you for answering my questions. I appreciate it since I am a real newbie in Flume / Hadoop , etc...
>
> But now I am confused. According to your statement, the fileType is the key here. Now just take a look at my flume.conf below:
> The fileType was set to "DataStream".
> So which is the right one: SequenceFile, DataStream or CompressedStream?

Here's what works for me in a situation very similar to yours:

# Sink configuration
agent.sinks.sink1.type = hdfs
agent.sinks.sink1.hdfs.path = /flume/cluster-logs
agent.sinks.sink1.hdfs.writeFormat = Text
agent.sinks.sink1.hdfs.fileType = DataStream
agent.sinks.sink1.hdfs.filePrefix = events-
agent.sinks.sink1.hdfs.round = true
agent.sinks.sink1.hdfs.roundValue = 10
agent.sinks.sink1.hdfs.roundUnit = minute
# agent.sinks.sink1.hdfs.serializer = org.apache.flume.serialization.BodyTextEventSerializer
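
To your question about the three values: SequenceFile (the default) writes
events into Hadoop SequenceFiles, DataStream writes the raw event output
uncompressed, and CompressedStream is like DataStream but run through the
codec named in hdfs.codeC. If you wanted compressed text output instead,
something along these lines should work (gzip here is just one example
codec):

agent.sinks.sink1.hdfs.fileType = CompressedStream
agent.sinks.sink1.hdfs.codeC = gzip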

Thanks,
Roman.