Posted to user@flume.apache.org by Preston Roy <Pr...@telus.com> on 2017/07/10 13:49:31 UTC

TAILDIR source help

Hello all,

I am having difficulty using the TAILDIR source in Flume.

Setup: Two servers, one with a .txt file that gets lines appended to it regularly.

Goal: Use the Flume TAILDIR source to append each newly written line to a file on the other server.

Issue: Whenever a new line is added to the source file, the current configuration appends the entire contents of the file on server 1 to the file on server 2. This produces duplicate lines in file 2, so it does not correctly reproduce the file from server 1.

Configuration on server 1 (observing the data):

#configure the agent
agent.sources=r1
agent.channels=k1
agent.sinks=c1

#using memory channel to hold up to 1000 events
agent.channels.k1.type=memory
agent.channels.k1.capacity=1000
agent.channels.k1.transactionCapacity=100

#connect source, channel, sink
agent.sources.r1.channels=k1
agent.sinks.c1.channel=k1

#define source
agent.sources.r1.type=TAILDIR
agent.sources.r1.channels=k1
agent.sources.r1.filegroups=f1
agent.sources.r1.filegroups.f1=/home/preston/Documents/tail_test_dir/test.txt
agent.sources.r1.maxBackoffSleep=1000

#connect to another box using avro and send the data
agent.sinks.c1.type=avro
agent.sinks.c1.hostname=10.10.10.4
agent.sinks.c1.port=4545
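
For reference, the TAILDIR source records how far it has read into each tailed file in a JSON position file, set by the positionFile property (by default under ~/.flume/). The fragment below is a sketch of setting it explicitly so the recorded offset can be inspected between appends; the path shown is an assumed location, not part of the configuration above.

```properties
#assumed path; any location writable by the user running the Flume agent should work
agent.sources.r1.positionFile=/home/preston/.flume/taildir_position.json
```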

Configuration on server 2 (appending data to a file):

#THIS ONE WRITES TO A FILE
#configure the agent
agent.sources=r1
agent.channels=k1
agent.sinks=c1


#using memory channel to hold up to 1000 events
agent.channels.k1.type=memory
agent.channels.k1.capacity=1000
agent.channels.k1.transactionCapacity=100

#connect source, channel, sink
agent.sources.r1.channels=k1
agent.sinks.c1.channel=k1

#here source is listening at the specified port using AVRO for data
agent.sources.r1.type=avro
agent.sources.r1.bind=0.0.0.0
agent.sources.r1.port=4545

#We use file_roll and write file at specified directory
agent.sinks.c1.type=file_roll
agent.sinks.c1.sink.directory=/home/preston/Documents/Flume_dump
#this is the path where you want Flume to dump results
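
Since the goal is a single output file, one setting that may be relevant is the file_roll sink's sink.rollInterval property. It defaults to 30, so the sink starts a new file in the directory every 30 seconds; setting it to 0 disables rolling. The line below is an untested sketch and is not part of the configuration above.

```properties
#0 disables time-based rolling so all events are written to a single file
agent.sinks.c1.sink.rollInterval=0
```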


Any help at all is greatly appreciated!

Thank you,

Preston Roy, BSc in Electrical Engineering
Co-op student
Network and Video Services