Posted to user@flume.apache.org by Søren <sd...@syntonetic.com> on 2012/01/24 11:12:11 UTC

fix for hanging collector node

Dear flume community

We have encountered a serious problem in our collector node (Flume 
0.9.4). After browsing the mailing list, it seems to be a well-known issue:
..........
2012-01-23 15:26:31,611 INFO 
com.cloudera.flume.handlers.debug.StubbornAppendSink: append Interrupted 
event '[eventcontent]' with error: Blocked append interrupted by 
rotation event
2012-01-23 15:26:31,612 INFO 
com.cloudera.flume.handlers.rolling.RollSink: closing RollSink 
'escapedCustomDfs("s3n://flume-log/eventlog/","el-%{rolltag}" )'
2012-01-23 15:26:31,612 ERROR 
com.cloudera.flume.core.connector.DirectDriver: Closing down due to 
exception during append calls
2012-01-23 15:26:31,612 INFO 
com.cloudera.flume.core.connector.DirectDriver: Connector logicalNode 
CollectorEventlog-22 exited with error: null
.........
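
To spot this failure without reading the log by hand, something like the following could be used. It is only a minimal sketch in Python: the pattern is derived from the excerpt above, and the function name `find_exited_nodes` is hypothetical, not part of Flume.

```python
import re

# Pattern for the DirectDriver failure line seen in the excerpt above.
# The log layout is assumed from that excerpt only.
FAILURE = re.compile(
    r"DirectDriver:\s+Connector\s+logicalNode\s+(\S+)\s+exited\s+with\s+error"
)

def find_exited_nodes(log_text):
    """Return the logical node names reported as exited with an error."""
    # Collapse whitespace so that lines wrapped by a mail client still match.
    flat = " ".join(log_text.split())
    return FAILURE.findall(flat)

sample = """
2012-01-23 15:26:31,612 ERROR com.cloudera.flume.core.connector.DirectDriver: Closing down due to exception during append calls
2012-01-23 15:26:31,612 INFO com.cloudera.flume.core.connector.DirectDriver: Connector logicalNode CollectorEventlog-22 exited with error: null
"""
print(find_exited_nodes(sample))  # ['CollectorEventlog-22']
```

A check like this could run periodically against the collector log and raise an alert when the list is non-empty.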

We are using an S3 sink, which supports the theory that the node is 
sensitive to timeouts or similar errors from the sink.
The bug should be solved with:
+ patch FLUME-762
+ patch FLUME-798
+ setting flume.collector.roll.timeout to 0

Have I understood this correctly so far?
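
For the timeout setting, I assume the property goes into the node's flume-site.xml as a Hadoop-style <property> entry; both the file and the format are my assumption and should be checked against the installed version. A small Python sketch that builds such an entry (the helper name `site_property` is hypothetical):

```python
import xml.etree.ElementTree as ET

def site_property(name, value):
    """Build one Hadoop-style <property> element as an XML string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(prop, encoding="unicode")

# Property name and value are the ones listed above; where the entry is
# placed (inside <configuration> in flume-site.xml) is an assumption.
snippet = site_property("flume.collector.roll.timeout", 0)
print(snippet)
```

The printed element would then be pasted inside the <configuration> element of the site file, followed by a restart of the node.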

We have installed Flume via apt, and haven't dealt with anything other 
than the binary build of Flume until now.
Are there updated binaries out there?
If not, what is the easiest and safest way to apply those patches?

Thanks for a great product anyway. We are looking forward to flume-ng 
with support for thrift source/S3 sink.

Thanks in advance
Søren


Re: fix for hanging collector node

Posted by Søren <sd...@syntonetic.com>.
Thanks Prasad

That update release is conveniently close. We will go for that and 
take it from there.

Cheers
Søren

On 24/01/2012 19:25, Prasad Mujumdar wrote:
>
> Yes, it looks like you are running into FLUME-798, so these two 
> patches and the configured timeout property should resolve the problem.
> If you don't want to build the patch, then you can use Flume from the 
> upcoming CDH3 update3, which should be available by the end of the month.
>
> thanks
> Prasad

Re: fix for hanging collector node

Posted by Prasad Mujumdar <pr...@cloudera.com>.
Yes, it looks like you are running into FLUME-798, so these two patches
and the configured timeout property should resolve the problem.
If you don't want to build the patch, then you can use Flume from the
upcoming CDH3 update3, which should be available by the end of the month.

thanks
Prasad


