Posted to dev@flume.apache.org by "Satoshi Iijima (JIRA)" <ji...@apache.org> on 2015/08/29 04:33:45 UTC

[jira] [Commented] (FLUME-2777) Tail Dir Source leads to duplicate events on rolling the tailed file

    [ https://issues.apache.org/jira/browse/FLUME-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720927#comment-14720927 ] 

Satoshi Iijima commented on FLUME-2777:
---------------------------------------

I recommend adjusting the file path regex so that it does not match the renamed file path.
When some files are truncated (deleted or archived into a compressed file) and a new file is generated, the new file can occasionally have the same inode as the truncated file. It needs to be read as a new file from position 0.
I think it is difficult to completely distinguish a new file that has the same inode as a file being tailed from a renamed file.
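
To illustrate the regex adjustment suggested above, here is a minimal sketch in plain Java. It does not use any Flume API; the class name, the two patterns, and the helper method are hypothetical and only mirror the file names from the issue description (logfile1 renamed to logfile2). It shows that a broad pattern also picks up the renamed file, while a narrower one does not.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class TailPatternSketch {
    // Hypothetical patterns, not taken from any real Flume configuration.
    // BROAD matches both the active file and the renamed (rolled) file.
    static final Pattern BROAD = Pattern.compile("logfile.*");
    // NARROW matches only the active file name.
    static final Pattern NARROW = Pattern.compile("logfile1");

    /** Returns the file names that the given pattern would select for tailing. */
    static List<String> tailed(Pattern pattern, List<String> fileNames) {
        return fileNames.stream()
                .filter(name -> pattern.matcher(name).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("logfile1", "logfile2");
        System.out.println(tailed(BROAD, names));  // prints [logfile1, logfile2]
        System.out.println(tailed(NARROW, names)); // prints [logfile1]
    }
}
```

With the narrow pattern the renamed file is never considered, so the question of which position to resume it from does not arise. The trade-off is that events written to the old file shortly before the rename but not yet read may be missed.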


> Tail Dir Source leads to duplicate events on rolling the tailed file
> --------------------------------------------------------------------
>
>                 Key: FLUME-2777
>                 URL: https://issues.apache.org/jira/browse/FLUME-2777
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.7
>            Reporter: Johny Rufus
>            Assignee: Johny Rufus
>         Attachments: FLUME-2777.patch
>
>
> I have a simple setup, where I write 200 events to logfile1. [TailSrc is on the lookout for logfile* ]
> Then I rename logfile1 to logfile2.
> I create a new logfile1 and write 100 events to it.
> Typically I should see 300 events in my channel. But I see 500 events.
> I was able to trace the duplicates to updateFiles(boolean) in ReliableTaildirEventReader.java, specifically to the way renamed files are handled: the starting position is specified as 0. [This starting position should be obtained from tf.getPosition()]
> I am attaching a proposed fix. It would be great if one of you guys [~iijima_satoshi] / [~hshreedharan] / [~roshan_naik] could take a look at the fix and validate the issue.
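
The arithmetic in the description (500 events seen instead of 300) can be reproduced with a small self-contained model. This is only a sketch under stated assumptions, not the code in ReliableTaildirEventReader.java or the attached patch: the class, its fields, and the poll method are hypothetical, positions are counted in events rather than bytes, and files are identified by a made-up inode number.

```java
import java.util.HashMap;
import java.util.Map;

public class RenameReplaySketch {
    /** Simplified stand-in for a tailed file: last known path and read position. */
    static final class Tracked {
        final String path;
        final long position; // counted in events here, an assumption for brevity
        Tracked(String path, long position) {
            this.path = path;
            this.position = position;
        }
    }

    private final Map<Long, Tracked> byInode = new HashMap<>();
    private final boolean resumeOnRename;
    long emitted = 0;

    RenameReplaySketch(boolean resumeOnRename) {
        this.resumeOnRename = resumeOnRename;
    }

    /** Reads a file from its chosen start position to its end and records the new position. */
    void poll(long inode, String path, long eventsInFile) {
        Tracked known = byInode.get(inode);
        long start;
        if (known == null) {
            start = 0;                                     // never seen: read from the beginning
        } else if (known.path.equals(path)) {
            start = known.position;                        // same file: continue where we stopped
        } else {
            start = resumeOnRename ? known.position : 0;   // same inode, new path: renamed
        }
        emitted += Math.max(0, eventsInFile - start);
        byInode.put(inode, new Tracked(path, eventsInFile));
    }

    /** Replays the scenario from the description and returns the total events emitted. */
    static long scenario(boolean resumeOnRename) {
        RenameReplaySketch reader = new RenameReplaySketch(resumeOnRename);
        reader.poll(1L, "logfile1", 200); // 200 events written to logfile1
        reader.poll(1L, "logfile2", 200); // logfile1 renamed to logfile2 (same inode)
        reader.poll(2L, "logfile1", 100); // new logfile1 with 100 events
        return reader.emitted;
    }

    public static void main(String[] args) {
        System.out.println(scenario(false)); // prints 500: renamed file re-read from 0
        System.out.println(scenario(true));  // prints 300: renamed file resumed from stored position
    }
}
```

Note that this model treats every "same inode, different path" case as a rename. It does not address the case raised in the comment above, where a new file reuses the inode of a deleted one and has to be read from position 0.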



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)