You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@camel.apache.org by "Philippe Khalife (JIRA)" <ji...@apache.org> on 2016/01/04 21:36:39 UTC

[jira] [Updated] (CAMEL-9478) GenericFileProcessStrategySupport renameFile is destructive when the source file is missing

     [ https://issues.apache.org/jira/browse/CAMEL-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philippe Khalife updated CAMEL-9478:
------------------------------------
    Environment: Windows, OSX, JVM 1.8.0_20  (was: Windows, OSX)
         Labels: file idempotent threading  (was: )
    Description: 
Given this setup:

String fileConsumerURI="file://spool?move=.done&doneFileName=${file:name}.done&sortBy=file:modified&maxMessagesPerPoll=5000&eagerMaxMessagesPerPoll=true"

RouteDefinition route = from(fileConsumerURI).routeId("MyRouteId").to("metrics:counter:MyRoute.counter?increment=1");

route.idempotentConsumer(header("CamelFileName"))
                                        .messageIdRepository(memoryIdempotentRepository)
                                        .threads(CONFIGURABLE_NUMBER)
                                        .unmarshal()
                                        .serialization()
                                        .process(MY_SIMPLE_PROCESSOR)
                                        .recipientList(header("HEADER_CREATED_BY_MY_SIMPLE_PROCESSOR"));


Can occasionally lead to a destructive action by "GenericFileProcessStrategySupport.renameFile" when it attempts to delete the "to.getAbsoluteFilePath()" while the source file "from.getAbsoluteFilePath()" does not even exist.
I think is a bug on its own regardless of how we got there. The GenericFileProcessStrategySupport.renameFile method should check that from.getAbsoluteFilePath() exists BEFORE it attempts the delete. 

How did we get there? - I need to do multi-threaded processing of stored files in a spool, the files  contain meta data about their intended destination, AND using a configurable number of threads to do the work.
Why the idempotentConsumer in the route? I needed to control the size of the memoryIdempotentRepository, which is NOT configurable in the file component.

In this setup the file component seems to have a tendency to occasionally send a file to processing more than once, the idempotentConsumer takes care of that pretty well, but NOT when I choose to move the files instead of delete them as per the route above.

This can be reproduced on its own by running a lot of data and having varied delays in the processing. After much waiting and catching the real bug, I came up with a simulation for it.

By debugging and suspending I can simulate when the problem happens. This setup shows the consequences of files being pushed in exchanges more than once by the file consumer.

Let file F1 get processed, and moved, copy it back to the spool manually (this part is just to force the file being listed for processing more than once), now the idempotentConsumer will reject it, while suspeneded manually move/delete F1 file from the spool, now the file consumer's GenericFileProcessStrategySupport.renameFile will end up eagerly deleting an already processed / moved files while the F1 file does not exist.  

I did on a temporary basis use the file consumer's idempotent setting to avoid this issue as a workaround, but this is not ideal for my workflow.

I have a number of recommendations:
1) Fix this bug: The GenericFileProcessStrategySupport.renameFile method should check that from.getAbsoluteFilePath() exists BEFORE it attempts the delete. 
2) Have a configurable idempotent repository for the file consumer.


A sample stack trace:

WARN  GenericFileOnCompletion:135 - Error during commit. Exchange[ID-I-local-53638-1450384044069-0-170][ID-I-local-53638-1450384044069-0-5]. Caused by: [org.apache.camel.component.file.GenericFileOperationFailedException - Cannot rename file: GenericFile[ID-I-local-53638-1450384044069-0-5] to: GenericFile[.sent/ID-I-local-53638-1450384044069-0-5]]
org.apache.camel.component.file.GenericFileOperationFailedException: Cannot
rename file: GenericFile[ID-I-local-53638-1450384044069-0-5] to:
GenericFile[.sent/ID-I-local-53638-1450384044069-0-5]
        at
org.apache.camel.component.file.strategy.GenericFileProcessStrategySupport.renameFile(GenericFileProcessStrategySupport.java:115)
        at
org.apache.camel.component.file.strategy.GenericFileRenameProcessStrategy.commit(GenericFileRenameProcessStrategy.java:88)
        at
org.apache.camel.component.file.GenericFileOnCompletion.processStrategyCommit(GenericFileOnCompletion.java:127)
        at
org.apache.camel.component.file.GenericFileOnCompletion.onCompletion(GenericFileOnCompletion.java:83)
        at
org.apache.camel.component.file.GenericFileOnCompletion.onComplete(GenericFileOnCompletion.java:57)
        at
org.apache.camel.util.UnitOfWorkHelper.doneSynchronizations(UnitOfWorkHelper.java:104)
        at
org.apache.camel.impl.DefaultUnitOfWork.done(DefaultUnitOfWork.java:229)
        at
org.apache.camel.util.UnitOfWorkHelper.doneUow(UnitOfWorkHelper.java:65)
        at
org.apache.camel.processor.CamelInternalProcessor$UnitOfWorkProcessorAdvice.after(CamelInternalProcessor.java:653)
        at
org.apache.camel.processor.CamelInternalProcessor$UnitOfWorkProcessorAdvice.after(CamelInternalProcessor.java:608)
        at
org.apache.camel.processor.CamelInternalProcessor$InternalCallback.done(CamelInternalProcessor.java:239)
        at org.apache.camel.processor.Pipeline.process(Pipeline.java:109)
        at
org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
        at
org.apache.camel.component.file.GenericFileConsumer.processExchange(GenericFileConsumer.java:442)
        at
org.apache.camel.component.file.GenericFileConsumer.processBatch(GenericFileConsumer.java:214)
        at
org.apache.camel.component.file.GenericFileConsumer.poll(GenericFileConsumer.java:178)
        at
org.apache.camel.impl.ScheduledPollConsumer.doRun(ScheduledPollConsumer.java:174)
        at
org.apache.camel.impl.ScheduledPollConsumer.run(ScheduledPollConsumer.java:101)
        at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)


  was:

Given this setup:

String fileConsumerURI="file://spool?move=.done&doneFileName=${file:name}.done&sortBy=file:modified&maxMessagesPerPoll=5000&eagerMaxMessagesPerPoll=true"

RouteDefinition route = from(fileConsumerURI).routeId("MyRouteId").to("metrics:counter:MyRoute.counter?increment=1");

route.idempotentConsumer(header("CamelFileName"))
                                        .messageIdRepository(memoryIdempotentRepository)
                                        .threads(CONFIGURABLE_NUMBER)
                                        .unmarshal()
                                        .serialization()
                                        .process(MY_SIMPLE_PROCESSOR)
                                        .recipientList(header("HEADER_CREATED_BY_MY_SIMPLE_PROCESSOR"));


Can occasionally lead to a destructive action by "GenericFileProcessStrategySupport.renameFile" when it attempts to delete the "to.getAbsoluteFilePath()" while the source file "from.getAbsoluteFilePath()" does not even exist.
I think is a bug on its own regardless of how we got there. The GenericFileProcessStrategySupport.renameFile method should check that from.getAbsoluteFilePath() exists BEFORE it attempts the delete. 

How did we get there? - I need to do multi-threaded processing of stored files in a spool, the files  contain meta data about their intended destination, AND using a configurable number of threads to do the work.
Why the idempotentConsumer in the route? I needed to control the size of the memoryIdempotentRepository, which is NOT configurable in the file component.

In this setup the file component seems to have a tendency to occasionally send a file to processing more than once, the idempotentConsumer takes care of that pretty well, but NOT when I choose to move the files instead of delete them as per the route above.

This can be reproduced on its own by running a lot of data and having varied delays in the processing. After much waiting and catching the real bug, I came up with a simulation for it.

By debugging and suspending I can simulate when the problem happens. This setup shows the consequences of files being pushed in exchanges more than once by the file consumer.

Let file F1 get processed, and moved, copy it back to the spool manually (this part is just to force the file being listed for processing more than once), now the idempotentConsumer will reject it, while suspeneded manually move/delete F1 file from the spool, now the file consumer's GenericFileProcessStrategySupport.renameFile will end up eagerly deleting an already processed / moved files while the F1 file does not exist.  

I did on a temporary basis use the file consumer's idempotent setting to avoid this issue as a workaround, but this is not ideal for my workflow.

I have a number of recommendations:
1) Fix this bug: The GenericFileProcessStrategySupport.renameFile method should check that from.getAbsoluteFilePath() exists BEFORE it attempts the delete. 
2) Have a configurable idempotent repository for the file consumer.


A sample stack trace:

WARN  GenericFileOnCompletion:135 - Error during commit. Exchange[ID-I-local-53638-1450384044069-0-170][ID-I-local-53638-1450384044069-0-5]. Caused by: [org.apache.camel.component.file.GenericFileOperationFailedException - Cannot rename file: GenericFile[ID-I-local-53638-1450384044069-0-5] to: GenericFile[.sent/ID-I-local-53638-1450384044069-0-5]]
org.apache.camel.component.file.GenericFileOperationFailedException: Cannot
rename file: GenericFile[ID-I-local-53638-1450384044069-0-5] to:
GenericFile[.sent/ID-I-local-53638-1450384044069-0-5]
        at
org.apache.camel.component.file.strategy.GenericFileProcessStrategySupport.renameFile(GenericFileProcessStrategySupport.java:115)
        at
org.apache.camel.component.file.strategy.GenericFileRenameProcessStrategy.commit(GenericFileRenameProcessStrategy.java:88)
        at
org.apache.camel.component.file.GenericFileOnCompletion.processStrategyCommit(GenericFileOnCompletion.java:127)
        at
org.apache.camel.component.file.GenericFileOnCompletion.onCompletion(GenericFileOnCompletion.java:83)
        at
org.apache.camel.component.file.GenericFileOnCompletion.onComplete(GenericFileOnCompletion.java:57)
        at
org.apache.camel.util.UnitOfWorkHelper.doneSynchronizations(UnitOfWorkHelper.java:104)
        at
org.apache.camel.impl.DefaultUnitOfWork.done(DefaultUnitOfWork.java:229)
        at
org.apache.camel.util.UnitOfWorkHelper.doneUow(UnitOfWorkHelper.java:65)
        at
org.apache.camel.processor.CamelInternalProcessor$UnitOfWorkProcessorAdvice.after(CamelInternalProcessor.java:653)
        at
org.apache.camel.processor.CamelInternalProcessor$UnitOfWorkProcessorAdvice.after(CamelInternalProcessor.java:608)
        at
org.apache.camel.processor.CamelInternalProcessor$InternalCallback.done(CamelInternalProcessor.java:239)
        at org.apache.camel.processor.Pipeline.process(Pipeline.java:109)
        at
org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
        at
org.apache.camel.component.file.GenericFileConsumer.processExchange(GenericFileConsumer.java:442)
        at
org.apache.camel.component.file.GenericFileConsumer.processBatch(GenericFileConsumer.java:214)
        at
org.apache.camel.component.file.GenericFileConsumer.poll(GenericFileConsumer.java:178)
        at
org.apache.camel.impl.ScheduledPollConsumer.doRun(ScheduledPollConsumer.java:174)
        at
org.apache.camel.impl.ScheduledPollConsumer.run(ScheduledPollConsumer.java:101)
        at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)



> GenericFileProcessStrategySupport renameFile is destructive when the source file is missing
> -------------------------------------------------------------------------------------------
>
>                 Key: CAMEL-9478
>                 URL: https://issues.apache.org/jira/browse/CAMEL-9478
>             Project: Camel
>          Issue Type: Bug
>    Affects Versions: 2.16.0
>         Environment: Windows, OSX, JVM 1.8.0_20
>            Reporter: Philippe Khalife
>              Labels: file, idempotent, threading
>
> Given this setup:
> String fileConsumerURI="file://spool?move=.done&doneFileName=${file:name}.done&sortBy=file:modified&maxMessagesPerPoll=5000&eagerMaxMessagesPerPoll=true"
> RouteDefinition route = from(fileConsumerURI).routeId("MyRouteId").to("metrics:counter:MyRoute.counter?increment=1");
> route.idempotentConsumer(header("CamelFileName"))
>                                         .messageIdRepository(memoryIdempotentRepository)
>                                         .threads(CONFIGURABLE_NUMBER)
>                                         .unmarshal()
>                                         .serialization()
>                                         .process(MY_SIMPLE_PROCESSOR)
>                                         .recipientList(header("HEADER_CREATED_BY_MY_SIMPLE_PROCESSOR"));
> Can occasionally lead to a destructive action by "GenericFileProcessStrategySupport.renameFile" when it attempts to delete the "to.getAbsoluteFilePath()" while the source file "from.getAbsoluteFilePath()" does not even exist.
> I think is a bug on its own regardless of how we got there. The GenericFileProcessStrategySupport.renameFile method should check that from.getAbsoluteFilePath() exists BEFORE it attempts the delete. 
> How did we get there? - I need to do multi-threaded processing of stored files in a spool, the files  contain meta data about their intended destination, AND using a configurable number of threads to do the work.
> Why the idempotentConsumer in the route? I needed to control the size of the memoryIdempotentRepository, which is NOT configurable in the file component.
> In this setup the file component seems to have a tendency to occasionally send a file to processing more than once, the idempotentConsumer takes care of that pretty well, but NOT when I choose to move the files instead of delete them as per the route above.
> This can be reproduced on its own by running a lot of data and having varied delays in the processing. After much waiting and catching the real bug, I came up with a simulation for it.
> By debugging and suspending I can simulate when the problem happens. This setup shows the consequences of files being pushed in exchanges more than once by the file consumer.
> Let file F1 get processed, and moved, copy it back to the spool manually (this part is just to force the file being listed for processing more than once), now the idempotentConsumer will reject it, while suspeneded manually move/delete F1 file from the spool, now the file consumer's GenericFileProcessStrategySupport.renameFile will end up eagerly deleting an already processed / moved files while the F1 file does not exist.  
> I did on a temporary basis use the file consumer's idempotent setting to avoid this issue as a workaround, but this is not ideal for my workflow.
> I have a number of recommendations:
> 1) Fix this bug: The GenericFileProcessStrategySupport.renameFile method should check that from.getAbsoluteFilePath() exists BEFORE it attempts the delete. 
> 2) Have a configurable idempotent repository for the file consumer.
> A sample stack trace:
> WARN  GenericFileOnCompletion:135 - Error during commit. Exchange[ID-I-local-53638-1450384044069-0-170][ID-I-local-53638-1450384044069-0-5]. Caused by: [org.apache.camel.component.file.GenericFileOperationFailedException - Cannot rename file: GenericFile[ID-I-local-53638-1450384044069-0-5] to: GenericFile[.sent/ID-I-local-53638-1450384044069-0-5]]
> org.apache.camel.component.file.GenericFileOperationFailedException: Cannot
> rename file: GenericFile[ID-I-local-53638-1450384044069-0-5] to:
> GenericFile[.sent/ID-I-local-53638-1450384044069-0-5]
>         at
> org.apache.camel.component.file.strategy.GenericFileProcessStrategySupport.renameFile(GenericFileProcessStrategySupport.java:115)
>         at
> org.apache.camel.component.file.strategy.GenericFileRenameProcessStrategy.commit(GenericFileRenameProcessStrategy.java:88)
>         at
> org.apache.camel.component.file.GenericFileOnCompletion.processStrategyCommit(GenericFileOnCompletion.java:127)
>         at
> org.apache.camel.component.file.GenericFileOnCompletion.onCompletion(GenericFileOnCompletion.java:83)
>         at
> org.apache.camel.component.file.GenericFileOnCompletion.onComplete(GenericFileOnCompletion.java:57)
>         at
> org.apache.camel.util.UnitOfWorkHelper.doneSynchronizations(UnitOfWorkHelper.java:104)
>         at
> org.apache.camel.impl.DefaultUnitOfWork.done(DefaultUnitOfWork.java:229)
>         at
> org.apache.camel.util.UnitOfWorkHelper.doneUow(UnitOfWorkHelper.java:65)
>         at
> org.apache.camel.processor.CamelInternalProcessor$UnitOfWorkProcessorAdvice.after(CamelInternalProcessor.java:653)
>         at
> org.apache.camel.processor.CamelInternalProcessor$UnitOfWorkProcessorAdvice.after(CamelInternalProcessor.java:608)
>         at
> org.apache.camel.processor.CamelInternalProcessor$InternalCallback.done(CamelInternalProcessor.java:239)
>         at org.apache.camel.processor.Pipeline.process(Pipeline.java:109)
>         at
> org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
>         at
> org.apache.camel.component.file.GenericFileConsumer.processExchange(GenericFileConsumer.java:442)
>         at
> org.apache.camel.component.file.GenericFileConsumer.processBatch(GenericFileConsumer.java:214)
>         at
> org.apache.camel.component.file.GenericFileConsumer.poll(GenericFileConsumer.java:178)
>         at
> org.apache.camel.impl.ScheduledPollConsumer.doRun(ScheduledPollConsumer.java:174)
>         at
> org.apache.camel.impl.ScheduledPollConsumer.run(ScheduledPollConsumer.java:101)
>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)