Posted to dev@flume.apache.org by "Gaurav Kumar (JIRA)" <ji...@apache.org> on 2014/09/18 11:28:34 UTC

[jira] [Commented] (FLUME-2066) Spool directory source can get stuck in a "Serializer has been closed" loop when retireCurrentFile throws an exception

    [ https://issues.apache.org/jira/browse/FLUME-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138721#comment-14138721 ] 

Gaurav Kumar commented on FLUME-2066:
-------------------------------------

Are there plans to fix this issue? I see the exception when I copy a large file (1 GB+) into the spool directory with the cp command, for example:
{{cp /sourceDir/LargeFile.txt /flumeSpoolDir}}

What is probably happening is that cp writes the file buffer by buffer, so the file's size keeps changing while Flume is already reading it, and that triggers the error condition. With smaller files, the copy has finished before Flume can detect any change.

To work around this issue, I am streaming the large file to Flume's netcat source using the nc command. Are there better alternatives? 
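
One alternative that might avoid the problem is to copy the file somewhere else first and only move the finished copy into the spool directory, so Flume never sees a file that is still growing. The following is a minimal sketch, not tested against Flume itself. The class and method names (SpoolDirStager, stageAndMove) and the staging directory are hypothetical, and it assumes the staging directory is outside the spool directory but on the same filesystem, since an atomic move is generally not possible across filesystems.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper; not part of Flume. */
final class SpoolDirStager {

    /**
     * Copies source into stagingDir, then moves the completed copy into
     * spoolDir in a single step. Assumes stagingDir and spoolDir are on the
     * same filesystem; otherwise the atomic move is expected to fail.
     */
    static Path stageAndMove(Path source, Path stagingDir, Path spoolDir) throws IOException {
        Path target = spoolDir.resolve(source.getFileName());
        // The spooling source expects uniquely named files, so refuse to
        // overwrite a file that is already in the spool directory.
        if (Files.exists(target)) {
            throw new FileAlreadyExistsException(target.toString());
        }

        Files.createDirectories(stagingDir);
        Path staged = stagingDir.resolve(source.getFileName());

        // The slow, buffer-by-buffer copy happens outside the spool directory.
        Files.copy(source, staged, StandardCopyOption.REPLACE_EXISTING);

        // The finished file appears in the spool directory all at once.
        return Files.move(staged, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

From the shell, the same idea would be a cp into the staging directory followed by an mv into /flumeSpoolDir.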

> Spool directory source can get stuck in a "Serializer has been closed" loop when retireCurrentFile throws an exception
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLUME-2066
>                 URL: https://issues.apache.org/jira/browse/FLUME-2066
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.4.0, v1.3.1
>            Reporter: Phil Scala
>            Assignee: Phil Scala
>
> The following 2 Java files have similar code and are affected by this issue: 
> 1.3.1: SpoolingFileLineReader.java 
> 1.4: ReliableSpoolingFileEventReader.java 
> retireCurrentFile has a single caller (readLines in 1.3.1 and readEvents in 1.4): 
> {code:java} 
> retireCurrentFile(); 
> currentFile = getNextFile(); 
> if (!currentFile.isPresent()) { 
>   return Collections.emptyList(); 
> } 
> {code} 
> If retireCurrentFile throws an exception after closing the reader (there are a few causes for an exception to be raised, which are described below), then currentFile still points to the file that could not be retired. This causes subsequent calls to readLines/readEvents to raise a "Serializer has been closed" exception. At this point the application needs to be shut down in order to rectify the problem. If Flume is left running for a while, the logs are littered with the error, so you have to go back to the initial error logged to understand what happened. 
> *Exceptions raised in "retireCurrentFile()"* 
> IllegalStateException when the file modified date changes 
> IllegalStateException when the size changes 
> IllegalStateException when renaming the current file and the target file already exists (with different sizes) 
> IllegalStateException when renaming the current file and the target file already exists [non windows] 
> FlumeException when renameTo does not return true. 
> The documentation does say: 
> *Warning This channel expects that only immutable, uniquely named files are dropped in the spooling directory. If duplicate names are used, or files are modified while being read, the source will fail with an error message *
> I am not sure, however, whether the intention was to get caught in the "Serializer has been closed" loop. 3 possible solutions: 
> 1. Re-spool the retired file, this will cause duplicates and could get caught in a loop of constantly spooling this file. 
> 2. Log an error and continue spooling the next files. 
> 3. Shutdown 
> I like option 2.
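
For illustration, option 2 from the description above could look roughly like the sketch below. This is a simplified, hypothetical model and not Flume's actual code: SkippingSpoolReader, SpoolFile and failedRetirements are made-up names, each call reads a whole file rather than a batch of events, and java.util.Optional stands in for whatever Optional type the real reader uses. The point it tries to show is that currentFile is cleared even when retiring fails, so the next call moves on to the next file instead of reusing a closed one.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Optional;

/** Hypothetical, simplified model of a spooling reader; not Flume's classes. */
final class SkippingSpoolReader {

    /** Stand-in for a spooled file whose retirement may fail. */
    interface SpoolFile {
        List<String> readAll();
        void retire(); // may throw, e.g. IllegalStateException if the size changed
    }

    private final Deque<SpoolFile> pending = new ArrayDeque<>();
    private Optional<SpoolFile> currentFile = Optional.empty();
    private int failedRetirements = 0;

    SkippingSpoolReader(List<SpoolFile> files) {
        pending.addAll(files);
    }

    /** Retires the previous file (if any) and returns the next file's lines. */
    List<String> readEvents() {
        if (currentFile.isPresent()) {
            try {
                currentFile.get().retire();
            } catch (RuntimeException e) {
                // Option 2: log the error and carry on with the next file.
                failedRetirements++;
                System.err.println("Could not retire file, skipping it: " + e.getMessage());
            } finally {
                // Always forget the old file so it is never read again.
                currentFile = Optional.empty();
            }
        }
        currentFile = Optional.ofNullable(pending.poll());
        if (!currentFile.isPresent()) {
            return Collections.emptyList();
        }
        return currentFile.get().readAll();
    }

    int failedRetirements() {
        return failedRetirements;
    }
}
```

With this shape, a file that fails to retire is reported once and left where it is, rather than producing the same error on every subsequent call.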



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)