Posted to user@flume.apache.org by orahad bigdata <or...@gmail.com> on 2014/02/27 16:49:05 UTC

Flume log event per file

Hi All,

I'm new to Flume. I have a small Hadoop setup with a Flume agent on it, and I'm
using tail -f logfilename as a source.

When I started the agent it began ingesting data into HDFS, but each file
contains only 10 lines. Can we configure the number of lines per file on HDFS?

Below is my agent conf file.


agent.sources = pstream
agent.channels = memoryChannel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 100000
agent.channels.memoryChannel.transactionCapacity = 10000
agent.sources.pstream.channels = memoryChannel
agent.sources.pstream.type = exec
agent.sources.pstream.command = tail -f /root/dummylog
agent.sources.pstream.batchSize = 1000
agent.sinks = hdfsSink
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = memoryChannel
agent.sinks.hdfsSink.hdfs.path = hdfs://xxxxx:xxx/somepath
agent.sinks.hdfsSink.hdfs.fileType = DataStream
agent.sinks.hdfsSink.hdfs.writeFormat = Text


Thanks

Re: Flume log event per file

Posted by Jeff Lord <jl...@cloudera.com>.
It looks like you have not configured any properties for "rolling" files
on HDFS. The default hdfs.rollCount is 10 events, which is why each file
contains exactly 10 lines.

http://flume.apache.org/FlumeUserGuide.html#hdfs-sink

The Flume HDFS sink can be configured to roll files based on size, number
of events, or time, via the following properties:

Property            Default   Description
hdfs.rollInterval   30        Number of seconds to wait before rolling the
                              current file (0 = never roll based on time
                              interval)
hdfs.rollSize       1024      File size to trigger roll, in bytes (0 = never
                              roll based on file size)
hdfs.rollCount      10        Number of events written to file before it is
                              rolled (0 = never roll based on number of events)
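
For example, to roll a new file after every 100,000 events instead of the
default 10, you could add something like the following to your agent
configuration (a minimal sketch; the property names come from the HDFS sink
documentation linked above, and the values are only illustrative):

agent.sinks.hdfsSink.hdfs.rollCount = 100000
# Illustrative: disable the time- and size-based triggers so that only
# the event count controls when a file is rolled
agent.sinks.hdfsSink.hdfs.rollInterval = 0
agent.sinks.hdfsSink.hdfs.rollSize = 0

With rollInterval and rollSize set to 0, each HDFS file should close after
exactly 100,000 events have been written to it.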
