Posted to user@flume.apache.org by sanjeev sagar <sa...@gmail.com> on 2013/06/20 22:10:27 UTC
Flume-NG agent issue on daily rotation files
Hello All, I'm trying to load the app servers' request logs into Hadoop HDFS.
I get all the consolidated logs in one file per day. I'm running the Flume
agent with the following config:
##
agent.sources = apache
agent.sources.apache.type = exec
agent.sources.apache.command = cat /appserverlogs/requestfile/request.log.2013_06_07
agent.sources.apache.batchSize = 1
agent.sources.apache.channels = memoryChannel
agent.sources.apache.interceptors = itime ihost itype
# http://flume.apache.org/FlumeUserGuide.html#timestamp-interceptor
agent.sources.apache.interceptors.itime.type = timestamp
# http://flume.apache.org/FlumeUserGuide.html#host-interceptor
agent.sources.apache.interceptors.ihost.type = host
agent.sources.apache.interceptors.ihost.useIP = false
agent.sources.apache.interceptors.ihost.hostHeader = host
# http://flume.apache.org/FlumeUserGuide.html#static-interceptor
agent.sources.apache.interceptors.itype.type = static
agent.sources.apache.interceptors.itype.key = log_type
agent.sources.apache.interceptors.itype.value = request_logs
# http://flume.apache.org/FlumeUserGuide.html#memory-channel
agent.channels = memoryChannel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 1000
agent.channels.memoryChannel.transactionCapacity = 100
agent.channels.memoryChannel.keep-alive = 3
agent.channels.memoryChannel.byteCapacityBufferPercentage = 20
## Send to Flume Collector on 1.2.3.4 (Hadoop Slave Node)
# http://flume.apache.org/FlumeUserGuide.html#avro-sink
agent.sinks = AvroSink
agent.sinks.AvroSink.type = avro
agent.sinks.AvroSink.channel = memoryChannel
agent.sinks.AvroSink.hostname = h1.vgs.mypoints.com
agent.sinks.AvroSink.port = 4545
Here you can see that I'm using the cat command with a specific file.
As I said, I get one file per day with the date in its name.
Q: How can I configure the agent to rotate to each day's new file name
automatically? Currently, once the file is loaded, I have to stop the
agent, change the config, and start the agent again.
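One common workaround (this is a suggestion, not part of the original post, and the spool directory path below is an assumption) is to drop the exec/cat source and use Flume's Spooling Directory Source, which ingests any completed file that appears in a watched directory, so each day's rotated file is picked up without a config change or agent restart:

```properties
# Sketch: assumes a cron job (or the log rotator itself) moves each
# finished day's file into /appserverlogs/spool once it is no longer
# being written to.
agent.sources = apache
agent.sources.apache.type = spooldir
agent.sources.apache.spoolDir = /appserverlogs/spool
agent.sources.apache.channels = memoryChannel
```

Note that the spooling source requires files to be complete and immutable by the time they land in the directory; after ingesting a file it renames it with a .COMPLETED suffix.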
On the Hadoop slave I have the collector running with the following config:
collector.sources = AvroIn
collector.sources.AvroIn.type = avro
collector.sources.AvroIn.bind = 0.0.0.0
collector.sources.AvroIn.port = 4545
collector.sources.AvroIn.channels = mc1 mc2
## Channels ########################################################
## Source writes to 2 channels, one for each sink (Fan Out)
collector.channels = mc1 mc2
collector.channels.mc1.type = memory
collector.channels.mc1.capacity = 1000
collector.channels.mc1.transactionCapacity = 100
collector.channels.mc1.keep-alive = 3
collector.channels.mc1.byteCapacityBufferPercentage = 20
collector.channels.mc2.type = memory
collector.channels.mc2.capacity = 1000
collector.channels.mc2.transactionCapacity = 100
collector.channels.mc2.keep-alive = 3
collector.channels.mc2.byteCapacityBufferPercentage = 20
## Sinks ###########################################################
collector.sinks = LocalOut HadoopOut
## Write copy to Local Filesystem (Debugging)
# http://flume.apache.org/FlumeUserGuide.html#file-roll-sink
collector.sinks.LocalOut.type = file_roll
collector.sinks.LocalOut.sink.directory = /var/log/flume
collector.sinks.LocalOut.sink.rollInterval = 0
collector.sinks.LocalOut.channel = mc1
## Write to HDFS
# http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
collector.sinks.HadoopOut.type = hdfs
collector.sinks.HadoopOut.channel = mc2
collector.sinks.HadoopOut.hdfs.path = /user/flume/events/%{log_type}/%{host}/%y-%m-%d
collector.sinks.HadoopOut.hdfs.fileType = DataStream
collector.sinks.HadoopOut.hdfs.writeFormat = Text
collector.sinks.HadoopOut.hdfs.rollSize = 0
collector.sinks.HadoopOut.hdfs.rollCount = 0
collector.sinks.HadoopOut.hdfs.rollInterval = 0
Q: The collector writes the file into HDFS with a .tmp extension. Until I
kill the collector, it doesn't rename the file to its final name. I've
played with
collector.sinks.HadoopOut.hdfs.rollSize = 0
collector.sinks.HadoopOut.hdfs.rollCount = 0
collector.sinks.HadoopOut.hdfs.rollInterval = 0
but then it creates many files. I'm looking to create one file per day of
request logs.
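One possible fix (a sketch, not from the original thread; the exact values are illustrative): instead of disabling every roll trigger, keep size- and count-based rolling off but set a time-based roll of one day, and add hdfs.idleTimeout so the open .tmp file is closed and renamed once events stop arriving:

```properties
collector.sinks.HadoopOut.hdfs.rollSize = 0
collector.sinks.HadoopOut.hdfs.rollCount = 0
# Roll (close and rename the .tmp file) every 24 hours
collector.sinks.HadoopOut.hdfs.rollInterval = 86400
# Close the file after 10 minutes with no new events (value is illustrative)
collector.sinks.HadoopOut.hdfs.idleTimeout = 600
```

Since the hdfs.path already includes %y-%m-%d, each day's events land in their own directory, so a daily rollInterval plus the idle timeout should yield roughly one finalized file per day.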
I really appreciate any help on this issue.
-Sanjeev
--
Sanjeev Sagar
*"**Separate yourself from everything that separates you from others
!" - Nirankari
Baba Hardev Singh ji *
**