Posted to dev@flume.apache.org by "Roshan Naik (JIRA)" <ji...@apache.org> on 2015/12/17 19:45:46 UTC

[jira] [Comment Edited] (FLUME-2801) Performance improvement on TailDir source

    [ https://issues.apache.org/jira/browse/FLUME-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062513#comment-15062513 ] 

Roshan Naik edited comment on FLUME-2801 at 12/17/15 6:45 PM:
--------------------------------------------------------------

+1 
Thanks [~iijima_satoshi] for the review. I'm running tests and will commit soon.


was (Author: roshan_naik):
Thanks [~iijima_satoshi] for the review. I'm running tests and will commit soon.

> Performance improvement on TailDir source
> -----------------------------------------
>
>                 Key: FLUME-2801
>                 URL: https://issues.apache.org/jira/browse/FLUME-2801
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources
>    Affects Versions: v1.7.0
>            Reporter: Jun Seok Hong
>            Assignee: Jun Seok Hong
>             Fix For: v1.7.0
>
>         Attachments: FLUME-2801-1.patch, FLUME-2801-2.patch, FLUME-2801.patch
>
>
> This is a proposal for a performance improvement to the new tailing source (FLUME-2498).
> Taildir source reads a file 1 byte at a time, so its performance is very low compared to tailing with the exec source.
> I tested many ways to improve performance and implemented the best one.
> Changes:
> * Read a file in 8 KB blocks instead of 1 byte at a time.
> * Use byte[] for handling data instead of ByteArrayDataOutput/ByteBuffer(direct)/.. for better performance.
> * Don't convert byte[] to String and vice versa.
> Simple file-reading test results:
> {quote}
>  File size: 100 MB
>  Line size: 500 bytes
> Estimated time to read the file:
> |Reading 1 byte at a time (using the code in Taildir)|32544 ms|
> |Reading 8 KB blocks|431 ms|
> {quote}
> When tested in Flume, it catches up with the performance of tailing with the exec source (a 30x performance boost).
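
For readers who want to see the idea behind the three changes listed in the description, here is a minimal sketch in Java. It is not the code from the attached patches: the class name BlockLineReader, the method readLines, the helper concat, and the constant BLOCK_SIZE are all hypothetical names chosen for illustration. It assumes lines are terminated by '\n' and simply reads a whole file, whereas a real tailing source would also track file positions and hold back an unterminated last line.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the FLUME-2801 patches.
final class BlockLineReader {
    // 8 KB block size, matching the block size mentioned in the description.
    private static final int BLOCK_SIZE = 8192;

    // Reads the file in 8 KB blocks and returns each line as raw bytes
    // (without the '\n' terminator). No byte[] <-> String conversion is done.
    static List<byte[]> readLines(String path) throws IOException {
        List<byte[]> lines = new ArrayList<>();
        byte[] block = new byte[BLOCK_SIZE];
        byte[] pending = new byte[0]; // bytes of a line that spans blocks

        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            int n;
            while ((n = file.read(block)) != -1) {
                int start = 0;
                for (int i = 0; i < n; i++) {
                    if (block[i] == '\n') {
                        lines.add(concat(pending, block, start, i));
                        pending = new byte[0];
                        start = i + 1;
                    }
                }
                // Carry over the unterminated tail of this block.
                if (start < n) {
                    pending = concat(pending, block, start, n);
                }
            }
        }
        // This sketch returns trailing bytes without a newline as a final line.
        if (pending.length > 0) {
            lines.add(pending);
        }
        return lines;
    }

    // Returns head followed by src[from, to) as a new array.
    private static byte[] concat(byte[] head, byte[] src, int from, int to) {
        byte[] out = new byte[head.length + (to - from)];
        System.arraycopy(head, 0, out, 0, head.length);
        System.arraycopy(src, from, out, head.length, to - from);
        return out;
    }
}
```

The point of the sketch is that each read call moves up to 8192 bytes at once, so the per-call overhead is paid once per block rather than once per byte, and line boundaries are found by scanning the byte[] directly.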



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)