Posted to dev@flume.apache.org by "Hari Shreedharan (JIRA)" <ji...@apache.org> on 2014/02/02 03:22:08 UTC

[jira] [Commented] (FLUME-2309) Spooling directory should not always consume the oldest file first.

    [ https://issues.apache.org/jira/browse/FLUME-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888801#comment-13888801 ] 

Hari Shreedharan commented on FLUME-2309:
-----------------------------------------

Makes sense to me. It should be a simple fix, but we must document it, since older files might get sent only very late. Maybe we can make it smarter even when we sort? We could reuse the sorted list until it is empty to avoid a sort every time (not sure if we already do this).
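
A rough sketch of the "reuse the sorted list until it is empty" idea, in case it helps the discussion. This is not the existing ReliableSpoolingFileEventReader code; the class name CachedOldestFirstQueue and its next() method are made up for illustration, and it assumes the caller deletes or renames each file once it has been consumed, so that a later re-listing does not return it again.

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;

/** Hypothetical helper: lists and sorts the spool directory only when the cached list runs out. */
class CachedOldestFirstQueue {
    private final File spoolDir;
    // Files from the most recent listing, oldest first.
    private final Deque<File> pending = new ArrayDeque<>();

    CachedOldestFirstQueue(File spoolDir) {
        this.spoolDir = spoolDir;
    }

    /** Returns the next file to consume, or null if the directory has no regular files. */
    File next() {
        if (pending.isEmpty()) {
            refill(); // the only place where we list and sort
        }
        return pending.poll();
    }

    private void refill() {
        File[] files = spoolDir.listFiles(File::isFile);
        if (files == null) {
            return; // not a directory, or an I/O error occurred
        }
        Arrays.sort(files, Comparator.comparingLong(File::lastModified));
        pending.addAll(Arrays.asList(files));
    }
}
```

The trade-off is that files arriving while the cached list is non-empty are not seen until the next refill, so ordering is oldest-first only within each listing. That would also need to be documented.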

> Spooling directory should not always consume the oldest file first.
> -------------------------------------------------------------------
>
>                 Key: FLUME-2309
>                 URL: https://issues.apache.org/jira/browse/FLUME-2309
>             Project: Flume
>          Issue Type: New Feature
>            Reporter: Muhammad Ehsan ul Haque
>            Priority: Minor
>
> The ReliableSpoolingFileEventReader reads the oldest file in the spooling directory first. This is done by listing the directory contents and then sorting the file list by timestamp. This may be very slow if there are a lot of files (on the order of 100K or more) in the directory.
> However, this is not always needed; there can be simple cases in which the order in which the files are consumed is not important.
> There should be an option to consume the files in arbitrary order, allowing the files to be consumed quickly without any delay.
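
To make the requested option concrete, here is a minimal sketch of how an arbitrary-order mode could avoid the listing and sorting cost described in the issue. Everything in it is an assumption for illustration: the SpoolFilePicker class, the ConsumeOrder enum and the pickNext method are hypothetical names, not Flume APIs. Note also that the OLDEST branch uses a single linear scan for the minimum timestamp rather than the full sort the issue describes.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical helper illustrating an ordering option for the spooling directory. */
class SpoolFilePicker {
    enum ConsumeOrder { OLDEST, ARBITRARY }

    /** Picks the next regular file from spoolDir according to the given order. */
    static Optional<Path> pickNext(Path spoolDir, ConsumeOrder order) throws IOException {
        try (DirectoryStream<Path> entries =
                 Files.newDirectoryStream(spoolDir, p -> Files.isRegularFile(p))) {
            Path oldest = null;
            long oldestMillis = Long.MAX_VALUE;
            for (Path p : entries) {
                if (order == ConsumeOrder.ARBITRARY) {
                    // Take whatever the directory stream yields first;
                    // no timestamps are read and nothing is sorted.
                    return Optional.of(p);
                }
                long modified = Files.getLastModifiedTime(p).toMillis();
                if (oldest == null || modified < oldestMillis) {
                    oldest = p;
                    oldestMillis = modified;
                }
            }
            return Optional.ofNullable(oldest);
        }
    }
}
```

With ARBITRARY the method returns after the first matching entry, so its cost does not grow with the number of files in the directory. OLDEST still has to look at every entry.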



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)