Posted to issues@flume.apache.org by "Confuse (Jira)" <ji...@apache.org> on 2020/01/09 14:06:00 UTC

[jira] [Created] (FLUME-3351) Taildir source data duplication

Confuse created FLUME-3351:
------------------------------

             Summary: Taildir source data duplication
                 Key: FLUME-3351
                 URL: https://issues.apache.org/jira/browse/FLUME-3351
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: 1.9.0
            Reporter: Confuse


If the server restarts abnormally, the taildir source may read the same data more than once. This is easy to reproduce, for example by running the reboot command.
My reproduction scenario is as follows:
Agent one is deployed on server one and is configured with a taildir source, a file channel, and an avro sink. Agent two is deployed on server two and is configured with an avro source, a file channel, and an hdfs sink. The two agents are connected over avro, so agent two receives its data from agent one. When I reboot server one, the data on HDFS contains duplicates after server one recovers from the failure.
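
For reference, the topology described above could be configured roughly as in the sketch below. This is not the reporter's actual configuration, which was not posted. The agent names (a1, a2), component names, host names, port, and all file and HDFS paths are hypothetical placeholders; only the component types (taildir source, file channel, avro sink, avro source, hdfs sink) come from the report.

```properties
# --- Agent one (server one): taildir source -> file channel -> avro sink ---
# All names, paths, hosts, and ports below are illustrative assumptions.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = TAILDIR
a1.sources.r1.positionFile = /var/lib/flume/taildir_position.json
a1.sources.r1.filegroups = f1
a1.sources.r1.filegroups.f1 = /var/log/app/.*log.*
a1.sources.r1.channels = c1

a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/lib/flume/a1/checkpoint
a1.channels.c1.dataDirs = /var/lib/flume/a1/data

a1.sinks.k1.type = avro
a1.sinks.k1.hostname = server-two.example.com
a1.sinks.k1.port = 4545
a1.sinks.k1.channel = c1

# --- Agent two (server two): avro source -> file channel -> hdfs sink ---
a2.sources = r1
a2.channels = c1
a2.sinks = k1

a2.sources.r1.type = avro
a2.sources.r1.bind = 0.0.0.0
a2.sources.r1.port = 4545
a2.sources.r1.channels = c1

a2.channels.c1.type = file
a2.channels.c1.checkpointDir = /var/lib/flume/a2/checkpoint
a2.channels.c1.dataDirs = /var/lib/flume/a2/data

a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = hdfs://namenode.example.com/flume/events
a2.sinks.k1.channel = c1
```

With a setup like this, the taildir source on agent one records how far it has read in each file in the position file, so the duplication reported here would show up as events from before the reboot being delivered to the HDFS path a second time.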



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
