Posted to user@flume.apache.org by sanjeev sagar <sa...@gmail.com> on 2013/06/20 22:10:27 UTC
Flume-NG agent issue on daily rotation files
Hello All, I'm trying to load the app servers' request logs into Hadoop HDFS.
I get all the consolidated logs in one file per day. I'm running the Flume
agent with the following config:
##
agent.sources = apache
agent.sources.apache.type = exec
agent.sources.apache.command = cat /appserverlogs/requestfile/request.log.2013_06_07
agent.sources.apache.batchSize = 1
agent.sources.apache.channels = memoryChannel
agent.sources.apache.interceptors = itime ihost itype
# http://flume.apache.org/FlumeUserGuide.html#timestamp-interceptor
agent.sources.apache.interceptors.itime.type = timestamp
# http://flume.apache.org/FlumeUserGuide.html#host-interceptor
agent.sources.apache.interceptors.ihost.type = host
agent.sources.apache.interceptors.ihost.useIP = false
agent.sources.apache.interceptors.ihost.hostHeader = host
# http://flume.apache.org/FlumeUserGuide.html#static-interceptor
agent.sources.apache.interceptors.itype.type = static
agent.sources.apache.interceptors.itype.key = log_type
agent.sources.apache.interceptors.itype.value = request_logs
# http://flume.apache.org/FlumeUserGuide.html#memory-channel
agent.channels = memoryChannel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 1000
agent.channels.memoryChannel.transactionCapacity = 100
agent.channels.memoryChannel.keep-alive = 3
agent.channels.memoryChannel.byteCapacityBufferPercentage = 20
## Send to Flume Collector on 1.2.3.4 (Hadoop Slave Node)
# http://flume.apache.org/FlumeUserGuide.html#avro-sink
agent.sinks = AvroSink
agent.sinks.AvroSink.type = avro
agent.sinks.AvroSink.channel = memoryChannel
agent.sinks.AvroSink.hostname = h1.vgs.mypoints.com
agent.sinks.AvroSink.port = 4545
Here you can see that I'm using the cat command with a specific file.
As I said, I get one file per day with the date in its name.
Q: How can I configure the agent to rotate to each day's new file name
automatically? Currently, once the file is loaded, I have to stop the
agent, change the config, and start the agent again.
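One common workaround (this is a suggestion, not part of the original post, and the spool directory path below is an assumption) is to drop the exec/cat source and use Flume's Spooling Directory Source, which ingests any completed file that appears in a watched directory, so each day's rotated file is picked up without a config change or agent restart:

```properties
# Sketch: assumes a cron job (or the log rotator itself) moves each
# finished day's file into /appserverlogs/spool once it is no longer
# being written to.
agent.sources = apache
agent.sources.apache.type = spooldir
agent.sources.apache.spoolDir = /appserverlogs/spool
agent.sources.apache.channels = memoryChannel
```

Note that the spooling source requires files to be complete and immutable by the time they land in the directory; after ingesting a file it renames it with a .COMPLETED suffix.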
On the Hadoop slave I have the collector running with the following config:
collector.sources = AvroIn
collector.sources.AvroIn.type = avro
collector.sources.AvroIn.bind = 0.0.0.0
collector.sources.AvroIn.port = 4545
collector.sources.AvroIn.channels = mc1 mc2
## Channels ########################################################
## Source writes to 2 channels, one for each sink (Fan Out)
collector.channels = mc1 mc2
collector.channels.mc1.type = memory
collector.channels.mc1.capacity = 1000
collector.channels.mc1.transactionCapacity = 100
collector.channels.mc1.keep-alive = 3
collector.channels.mc1.byteCapacityBufferPercentage = 20
collector.channels.mc2.type = memory
collector.channels.mc2.capacity = 1000
collector.channels.mc2.transactionCapacity = 100
collector.channels.mc2.keep-alive = 3
collector.channels.mc2.byteCapacityBufferPercentage = 20
## Sinks ###########################################################
collector.sinks = LocalOut HadoopOut
## Write copy to Local Filesystem (Debugging)
# http://flume.apache.org/FlumeUserGuide.html#file-roll-sink
collector.sinks.LocalOut.type = file_roll
collector.sinks.LocalOut.sink.directory = /var/log/flume
collector.sinks.LocalOut.sink.rollInterval = 0
collector.sinks.LocalOut.channel = mc1
## Write to HDFS
# http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
collector.sinks.HadoopOut.type = hdfs
collector.sinks.HadoopOut.channel = mc2
collector.sinks.HadoopOut.hdfs.path = /user/flume/events/%{log_type}/%{host}/%y-%m-%d
collector.sinks.HadoopOut.hdfs.fileType = DataStream
collector.sinks.HadoopOut.hdfs.writeFormat = Text
collector.sinks.HadoopOut.hdfs.rollSize = 0
collector.sinks.HadoopOut.hdfs.rollCount = 0
collector.sinks.HadoopOut.hdfs.rollInterval = 0
Q: The collector writes the file into HDFS with a .tmp extension. Until I
kill the collector, it doesn't rename the file to its final name. I've
played with
collector.sinks.HadoopOut.hdfs.rollSize = 0
collector.sinks.HadoopOut.hdfs.rollCount = 0
collector.sinks.HadoopOut.hdfs.rollInterval = 0
but then it creates many files. I'm looking to create one file per day of
request logs.
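One possible fix (a sketch, not from the original thread; the exact values are illustrative): instead of disabling every roll trigger, keep size- and count-based rolling off but set a time-based roll of one day, and add hdfs.idleTimeout so the open .tmp file is closed and renamed once events stop arriving:

```properties
collector.sinks.HadoopOut.hdfs.rollSize = 0
collector.sinks.HadoopOut.hdfs.rollCount = 0
# Roll (close and rename the .tmp file) every 24 hours
collector.sinks.HadoopOut.hdfs.rollInterval = 86400
# Close the file after 10 minutes with no new events (value is illustrative)
collector.sinks.HadoopOut.hdfs.idleTimeout = 600
```

Since the hdfs.path already includes %y-%m-%d, each day's events land in their own directory, so a daily rollInterval plus the idle timeout should yield roughly one finalized file per day.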
I really appreciate any help on this issue.
-Sanjeev
--
Sanjeev Sagar
*"**Separate yourself from everything that separates you from others
!" - Nirankari
Baba Hardev Singh ji *
**