Posted to dev@flume.apache.org by Fengdong Yu <az...@gmail.com> on 2013/01/07 06:57:04 UTC

Review Request: ExecSource don't flush the cache if there is no input entries

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8854/
-----------------------------------------------------------

Review request for Flume.


Description
-------

ExecSource has a default batchSize of 20. The exec source reads data from its input and puts it into a cache; once the cache is full, it pushes the cached entries to the channel.

However, if the exec source's cache is not full and there is no input for a long time, the entries stay in the cache and are not pushed to the channel until the cache fills up.

This patch adds a new config option for ExecSource, batchTimeout, with a default of 3 seconds. If batchTimeout is exceeded, all cached data is pushed to the channel even if the cache is not full.
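
As a rough illustration of the intended behavior (not the code in the patch), the sketch below flushes the cache when it is full or when batchTimeout has elapsed since the last flush. The class TimedBatch and its methods are hypothetical names, and the clock value is passed in explicitly so the example stays deterministic.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the source's cache; not Flume's actual code. */
final class TimedBatch {
    private final int batchSize;
    private final long batchTimeoutMillis;
    private final List<String> cache = new ArrayList<>();
    private long lastFlushMillis;

    TimedBatch(int batchSize, long batchTimeoutMillis, long nowMillis) {
        this.batchSize = batchSize;
        this.batchTimeoutMillis = batchTimeoutMillis;
        this.lastFlushMillis = nowMillis;
    }

    /**
     * Caches one entry. Returns the entries to push to the channel,
     * or an empty list if neither the size nor the timeout condition holds.
     */
    List<String> add(String entry, long nowMillis) {
        cache.add(entry);
        boolean full = cache.size() >= batchSize;
        boolean timedOut = nowMillis - lastFlushMillis >= batchTimeoutMillis;
        if (full || timedOut) {
            List<String> out = new ArrayList<>(cache);
            cache.clear();
            lastFlushMillis = nowMillis;
            return out;
        }
        return new ArrayList<>();
    }

    /** Number of entries still waiting in the cache. */
    int pending() {
        return cache.size();
    }
}
```

In this sketch the timeout is evaluated only inside add(), so entries would still wait in the cache while the input is idle.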


Diffs
-----

  flume-ng-core/src/main/java/org/apache/flume/source/ExecSource.java 495b03f 
  flume-ng-core/src/main/java/org/apache/flume/source/ExecSourceConfigurationConstants.java 1b35b01 

Diff: https://reviews.apache.org/r/8854/diff/


Testing
-------


Thanks,

Fengdong Yu


Re: Review Request: ExecSource don't flush the cache if there is no input entries

Posted by Hari Shreedharan <hs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8854/#review15125
-----------------------------------------------------------


Thanks for the patch!

I like the idea, but this approach does not look sufficient, since the timeout is checked only when a new line is written out. If no further lines are written after the first few, the flush never happens.

Also, please add a unit test for the feature.


flume-ng-core/src/main/java/org/apache/flume/source/ExecSource.java
<https://reviews.apache.org/r/8854/#comment32717>

    How does this help? The readLine() method would block until the next line is read from the process's stdout, right? So if the process writes only batchSize - 1 events before the timeout and then never writes again, the source would still not flush, right? You probably need to add another thread to make sure the flush happens.
    
    Also, when you do this, you need to be careful about synchronization: you will probably need to put this inside a synchronized block or lock, and put the timeout flush code under the same lock/synchronized block.
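
A minimal sketch of the suggestion above, assuming a separate timer thread and a single shared lock. TimedFlushBuffer, its methods, and the Consumer standing in for the channel are hypothetical names, not Flume APIs; the reader thread would call add() for each line returned by readLine().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical buffer; the size flush and the timeout flush share one lock. */
final class TimedFlushBuffer implements AutoCloseable {
    private final Object lock = new Object();
    private final List<String> cache = new ArrayList<>();
    private final int batchSize;
    private final long batchTimeoutMillis; // assumed to be > 0
    private final Consumer<List<String>> channel;
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
    private long lastFlushMillis = System.currentTimeMillis();

    TimedFlushBuffer(int batchSize, long batchTimeoutMillis,
                     Consumer<List<String>> channel) {
        this.batchSize = batchSize;
        this.batchTimeoutMillis = batchTimeoutMillis;
        this.channel = channel;
        // The timer thread runs even while the reader is blocked in readLine().
        timer.scheduleWithFixedDelay(this::flushIfTimedOut,
                batchTimeoutMillis, batchTimeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Called by the reader thread for every line it reads. */
    void add(String event) {
        synchronized (lock) {
            cache.add(event);
            if (cache.size() >= batchSize) {
                flushLocked();
            }
        }
    }

    /** Called by the timer thread; takes the same lock as add(). */
    private void flushIfTimedOut() {
        synchronized (lock) {
            long idle = System.currentTimeMillis() - lastFlushMillis;
            if (!cache.isEmpty() && idle >= batchTimeoutMillis) {
                flushLocked();
            }
        }
    }

    /** Must be called with the lock held. */
    private void flushLocked() {
        channel.accept(new ArrayList<>(cache));
        cache.clear();
        lastFlushMillis = System.currentTimeMillis();
    }

    /** Stops the timer and pushes whatever is still cached. */
    @Override
    public void close() {
        timer.shutdownNow();
        synchronized (lock) {
            if (!cache.isEmpty()) {
                flushLocked();
            }
        }
    }
}
```

Because the timer ticks once per batchTimeout, a partial batch in this sketch may wait up to roughly twice the timeout before it is flushed; a shorter tick interval would tighten that bound.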


- Hari Shreedharan


Re: Review Request: ExecSource don't flush the cache if there is no input entries

Posted by Denny Ye <de...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8854/#review15108
-----------------------------------------------------------

Ship it!

- Denny Ye

