Posted to issues@flume.apache.org by "Confuse (Jira)" <ji...@apache.org> on 2020/01/10 01:07:00 UTC
[jira] [Updated] (FLUME-3351) Taildir source data duplication

     [ https://issues.apache.org/jira/browse/FLUME-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Confuse updated FLUME-3351:
---------------------------
    Description: 
If the server restarts abnormally, the taildir source may read the same data more than once. This is easy to reproduce, for example by running the command: reboot.
My reproduction scenario is below:
Agent one is deployed on server one and is configured with a taildir source, a file channel, and an avro sink. Agent two is deployed on server two and is configured with an avro source, a file channel, and an hdfs sink. The two agents are connected by avro, so agent two receives data from agent one. When I reboot server one, the data on HDFS is duplicated after server one recovers from the failure.

  was:
If the server restarts abnormally, taildir source may read data repeatedly. It's easy to replicate this phenomenon，such as using the command: reboot. 
below is my recurrence scenario:
Agent one is deployed on server one, and it is configured taildir source, file channel, avro sink. While agent two is deployed on server two,  and agent two is configured avro source, file channel, hdfs sink.  Thistwo agents are connected by avro. It means agent two receives data from agent one. Then i reboot server one, data on HDFS must be repeated after server one recovery from failure.


> Taildir source data duplication
> -------------------------------
>
>                 Key: FLUME-3351
>                 URL: https://issues.apache.org/jira/browse/FLUME-3351
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.9.0
>            Reporter: Confuse
>            Priority: Major
>
> If the server restarts abnormally, the taildir source may read the same data more than once. This is easy to reproduce, for example by running the command: reboot.
> My reproduction scenario is below:
> Agent one is deployed on server one and is configured with a taildir source, a file channel, and an avro sink. Agent two is deployed on server two and is configured with an avro source, a file channel, and an hdfs sink. The two agents are connected by avro, so agent two receives data from agent one. When I reboot server one, the data on HDFS is duplicated after server one recovers from the failure.
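
The two-agent topology described above could be written roughly as the following pair of Flume configurations. This is only a sketch for reproducing the report, not the reporter's actual setup: the agent names (a1, a2), component names (r1, c1, k1), host names, port, and all file system and HDFS paths are assumptions chosen for illustration. The property keys themselves are the standard ones for the taildir source, file channel, avro sink, avro source, and hdfs sink.

```properties
# ---- Agent one (server one): taildir source -> file channel -> avro sink ----
# Agent and component names are hypothetical.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = TAILDIR
# Read offsets are persisted in this JSON file (path is an assumption).
a1.sources.r1.positionFile = /var/lib/flume/taildir_position.json
a1.sources.r1.filegroups = f1
# Files to tail (path pattern is an assumption).
a1.sources.r1.filegroups.f1 = /var/log/app/.*\\.log
# Interval in ms between position file writes; 3000 is the documented default.
a1.sources.r1.writePosInterval = 3000
a1.sources.r1.channels = c1

a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/lib/flume/a1/checkpoint
a1.channels.c1.dataDirs = /var/lib/flume/a1/data

a1.sinks.k1.type = avro
# Host and port of agent two's avro source (both assumptions).
a1.sinks.k1.hostname = server-two
a1.sinks.k1.port = 4545
a1.sinks.k1.channel = c1

# ---- Agent two (server two): avro source -> file channel -> hdfs sink ----
a2.sources = r1
a2.channels = c1
a2.sinks = k1

a2.sources.r1.type = avro
a2.sources.r1.bind = 0.0.0.0
a2.sources.r1.port = 4545
a2.sources.r1.channels = c1

a2.channels.c1.type = file
a2.channels.c1.checkpointDir = /var/lib/flume/a2/checkpoint
a2.channels.c1.dataDirs = /var/lib/flume/a2/data

a2.sinks.k1.type = hdfs
# Target HDFS directory (an assumption).
a2.sinks.k1.hdfs.path = hdfs://namenode/flume/events
a2.sinks.k1.channel = c1
```

With a setup like this, the position file on server one is the only record of how far each tailed file has been read. One possible explanation for the duplication, which the report itself does not confirm, is that events committed to the channel after the last position file write are read again when the agent restarts from the older offsets.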


