Posted to dev@flume.apache.org by "Jonathan Cooper-Ellis (JIRA)" <ji...@apache.org> on 2013/07/03 19:16:20 UTC

[jira] [Updated] (FLUME-2119) duplicate files cause flume to enter irrecoverable state

     [ https://issues.apache.org/jira/browse/FLUME-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Cooper-Ellis updated FLUME-2119:
-----------------------------------------

    Description: 
If a spooling directory (spooldir) source receives FileA, and Flume picks it up and renames it to FileA.COMPLETED, then placing another file with the same original name (FileA) in the directory causes Flume to log an IllegalStateException indefinitely. This is likely because Flume attempts to rename the second FileA to FileA.COMPLETED and finds that a file with that name already exists.
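
The suspected rename collision can be illustrated outside Flume. The following is a minimal sketch, not Flume's actual code: the class RenameCollisionDemo and the method tryMarkCompleted are hypothetical names, and the only behavior relied on is that java.nio.file.Files.move refuses to overwrite an existing target unless REPLACE_EXISTING is passed.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustration only: this is not Flume's code. Class and method names are hypothetical.
class RenameCollisionDemo {

    // Try to mark a file as processed by renaming it to "<name>.COMPLETED".
    // Returns true on success, false if that target name is already taken.
    static boolean tryMarkCompleted(Path file) throws IOException {
        Path target = file.resolveSibling(file.getFileName() + ".COMPLETED");
        try {
            // Without REPLACE_EXISTING, Files.move fails if the target exists.
            Files.move(file, target);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path spoolDir = Files.createTempDirectory("spool");

        // The first FileA is processed and renamed to FileA.COMPLETED.
        Path first = Files.write(spoolDir.resolve("FileA"), "first\n".getBytes());
        System.out.println("first rename ok:  " + tryMarkCompleted(first));

        // A second FileA arrives while FileA.COMPLETED is still in the directory.
        Path second = Files.write(spoolDir.resolve("FileA"), "second\n".getBytes());
        System.out.println("second rename ok: " + tryMarkCompleted(second));
    }
}
```

In this sketch the second call returns false and the second FileA stays in the directory under its original name. Whether Flume reaches the IllegalStateException through exactly this path is not confirmed here.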

When Flume has entered this state, it can only be recovered by removing the .COMPLETED file from the directory and restarting the agent.
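
Before applying the manual recovery described above, it may help to see which files are affected. The sketch below is a hypothetical helper, not part of Flume: it assumes the .COMPLETED suffix mentioned in this report and lists every file in a directory whose .COMPLETED counterpart already exists. It only reports names and deletes nothing.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Flume.
class SpoolCollisionFinder {

    // Suffix taken from this report; assumed to be the one in use.
    static final String SUFFIX = ".COMPLETED";

    // Returns the sorted names of files that also have a "<name>.COMPLETED" sibling.
    static List<String> findCollisions(Path spoolDir) throws IOException {
        List<String> collisions = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(spoolDir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (name.endsWith(SUFFIX)) {
                    continue; // already-completed files are not candidates
                }
                if (Files.exists(spoolDir.resolve(name + SUFFIX))) {
                    collisions.add(name);
                }
            }
        }
        Collections.sort(collisions);
        return collisions;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java SpoolCollisionFinder <spool directory>
        for (String name : findCollisions(Paths.get(args[0]))) {
            System.out.println(name + " collides with " + name + SUFFIX);
        }
    }
}
```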

The log message looks like this:

02 Jul 2013 21:32:09,371 ERROR [pool-4-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:164)  - Uncaught exception in Runnable
java.lang.IllegalStateException: Serializer has been closed
        at org.apache.flume.serialization.LineDeserializer.ensureOpen(LineDeserializer.java:124)
        at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:88)
        at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:221)
        at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:154)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

  was:
If a spooling directory (spooldir) source receives FileA, and Flume picks it up and renames it to FileA.COMPLETED, then placing another file with the same original name (FileA) in the directory causes Flume to log an IllegalStateException indefinitely. This is likely because Flume attempts to rename the second FileA to FileA.COMPLETED and finds that a file with that name already exists.

When Flume has entered this state, it can only be recovered by removing the .COMPLETED file from the directory and restarting the agent.

    
> duplicate files cause flume to enter irrecoverable state
> --------------------------------------------------------
>
>                 Key: FLUME-2119
>                 URL: https://issues.apache.org/jira/browse/FLUME-2119
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>            Reporter: Jonathan Cooper-Ellis
