Posted to dev@apex.apache.org by "Tushar Gosavi (JIRA)" <ji...@apache.org> on 2016/02/15 06:07:18 UTC

[jira] [Commented] (APEXCORE-343) Add property to AbstractFileInputOperator to trim processedFiles and ignoredFiles

    [ https://issues.apache.org/jira/browse/APEXCORE-343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146913#comment-15146913 ] 

Tushar Gosavi commented on APEXCORE-343:
----------------------------------------

One more option we should provide is to let users manage these lists through an external database. This would allow users to restart the application if the previous application state is lost or corrupted.
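
To make the suggestion above concrete, here is a rough sketch of what a pluggable store could look like. Everything in it is an assumption for illustration: FileStateStore, InMemoryFileStateStore and StoreBackedScanner are hypothetical names, not existing Apex classes, and the in-memory set only stands in for a real database-backed implementation.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface (not an Apex API): where the list of processed files lives.
interface FileStateStore {
    boolean isProcessed(String path);
    void markProcessed(String path);
    void forget(String path);
}

// In-memory stand-in. A real implementation would read and write a database
// table instead, so the list outlives the application's checkpointed state.
class InMemoryFileStateStore implements FileStateStore {
    private final Set<String> processed = ConcurrentHashMap.newKeySet();

    @Override
    public boolean isProcessed(String path) { return processed.contains(path); }

    @Override
    public void markProcessed(String path) { processed.add(path); }

    @Override
    public void forget(String path) { processed.remove(path); }
}

// Hypothetical scanner that asks the store instead of keeping its own set.
class StoreBackedScanner {
    private final FileStateStore store;

    StoreBackedScanner(FileStateStore store) { this.store = store; }

    // Returns true if the file has not been seen before, and records it.
    boolean accept(String path) {
        if (store.isProcessed(path)) {
            return false;
        }
        store.markProcessed(path);
        return true;
    }
}
```

Because the scanner holds no file list of its own, a freshly started application pointed at the same store would skip files the previous run had already recorded.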

> Add property to AbstractFileInputOperator to trim processedFiles and ignoredFiles
> ---------------------------------------------------------------------------------
>
>                 Key: APEXCORE-343
>                 URL: https://issues.apache.org/jira/browse/APEXCORE-343
>             Project: Apache Apex Core
>          Issue Type: Improvement
>    Affects Versions: 3.3.0
>            Reporter: Munagala V. Ramanath
>            Assignee: Munagala V. Ramanath
>
> In AbstractFileInputOperator, the processedFiles and DirectoryScanner.ignoredFiles sets can continue to grow without bound as new files are added to the monitored directory. There are scenarios where an input file is deleted by the application once it is processed; here, it is useful to provide a property that, if true, will cause the scanner to remove deleted files from these sets to prevent unbounded growth.
> The only way to do this currently is to extend DirectoryScanner and provide implementations of the scan() and createPartition() methods, which forces the operator writer to unnecessarily grapple with the internals of the base class.
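
The trimming behaviour described in the issue could be sketched roughly as follows. This is only an illustration under stated assumptions: TrimmingFileTracker and the trimDeletedFiles property are hypothetical names, the two sets merely mirror processedFiles and ignoredFiles, and the existence check uses the local filesystem via java.nio rather than the filesystem abstraction the operator actually uses.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Apex: tracks file paths the way
// processedFiles and ignoredFiles do, and can drop entries for deleted files.
class TrimmingFileTracker {
    private final Set<String> processedFiles = new HashSet<>();
    private final Set<String> ignoredFiles = new HashSet<>();

    // Hypothetical property; when false, the sets are never trimmed.
    private boolean trimDeletedFiles = false;

    void setTrimDeletedFiles(boolean trimDeletedFiles) {
        this.trimDeletedFiles = trimDeletedFiles;
    }

    void markProcessed(String path) { processedFiles.add(path); }

    void markIgnored(String path) { ignoredFiles.add(path); }

    int trackedCount() { return processedFiles.size() + ignoredFiles.size(); }

    // Meant to be called once per scan. Removes entries whose files no longer
    // exist and returns how many entries were removed.
    int trim() {
        if (!trimDeletedFiles) {
            return 0;
        }
        int before = trackedCount();
        processedFiles.removeIf(p -> !Files.exists(Paths.get(p)));
        ignoredFiles.removeIf(p -> !Files.exists(Paths.get(p)));
        return before - trackedCount();
    }
}
```

One consequence of this approach is that a file deleted and later re-created under the same name would no longer be recognised as processed, which is presumably why the behaviour is proposed as an opt-in property.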


