You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flume.apache.org by Juhani Connolly <ju...@gmail.com> on 2012/03/22 09:23:00 UTC
Review Request: FLUME-1030,
handle sinks with exceptions by putting them on a cooldown
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4445/
-----------------------------------------------------------
Review request for Flume.
Summary
-------
As discussed in the JIRA item, I modified FailoverSink to deal with all exceptions.
Now a sink that fails will be put onto a failed links queue, from which a recovery will be attempted after a timeout. Each sequential failure the timeout will increase. I am open to other methods of increasing the timeout(maybe add on a ceiling?)
This addresses bug FLUME-1030.
https://issues.apache.org/jira/browse/FLUME-1030
Diffs
-----
flume-ng-core/src/test/java/org/apache/flume/sink/TestFailoverSinkProcessor.java 195c121
flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java 7eada57
Diff: https://reviews.apache.org/r/4445/diff
Testing
-------
Modified the test for the new functionality, new test passes
No other tests should be affected, but my environment was having some weird problems. I'll look into them tomorrow, just leaving this up so people can have a browse and will confirm tests passing tomorrow
Thanks,
Juhani
Re: Review Request: FLUME-1030,
handle sinks with exceptions by putting them on a cooldown
Posted by Juhani Connolly <ju...@gmail.com>.
> On 2012-03-22 17:07:01, Arvind Prabhakar wrote:
> > Thanks for the patch Juhani. I was able to run the tests successfully. I have some minor feedback below for your consideration.
thanks for running the tests. back to normal on my end too
> On 2012-03-22 17:07:01, Arvind Prabhakar wrote:
> > flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java, line 90
> > <https://reviews.apache.org/r/4445/diff/1/?file=94575#file94575line90>
> >
> > It will be good to cap this penalty amount to a predefined/configured ceiling value.
Added a config variable
> On 2012-03-22 17:07:01, Arvind Prabhakar wrote:
> > flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java, lines 120-121
> > <https://reviews.apache.org/r/4445/diff/1/?file=94575#file94575line120>
> >
> > There is one slight issue here though - which is if the channel is empty, the sink being attempted to recover will likely return BACKOFF, implying that the sink is normal and has recovered.
> >
> > A minor nit: it will be nice if the process invocation on the failed sink was from within the process() that calls the active Sink. That way the logic stays in one place.
I got rid of the queue subclass and put the code in process... Though I'm not sure if that is the easiest way for the human brain to parse it...
I also changed things so that a backoff results in being returned to the failed list without a penalty
- Juhani
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4445/#review6231
-----------------------------------------------------------
On 2012-03-22 08:23:00, Juhani Connolly wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4445/
> -----------------------------------------------------------
>
> (Updated 2012-03-22 08:23:00)
>
>
> Review request for Flume.
>
>
> Summary
> -------
>
> As discussed in the JIRA item, I modified FailoverSink to deal with all exceptions.
> Now a sink that fails will be put onto a failed links queue, from which a recovery will be attempted after a timeout. Each sequential failure the timeout will increase. I am open to other methods of increasing the timeout(maybe add on a ceiling?)
>
>
> This addresses bug FLUME-1030.
> https://issues.apache.org/jira/browse/FLUME-1030
>
>
> Diffs
> -----
>
> flume-ng-core/src/test/java/org/apache/flume/sink/TestFailoverSinkProcessor.java 195c121
> flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java 7eada57
>
> Diff: https://reviews.apache.org/r/4445/diff
>
>
> Testing
> -------
>
> Modified the test for the new functionality, new test passes
>
> No other tests should be affected, but my environment was having some weird problems. I'll look into them tomorrow, just leaving this up so people can have a browse and will confirm tests passing tomorrow
>
>
> Thanks,
>
> Juhani
>
>
Re: Review Request: FLUME-1030,
handle sinks with exceptions by putting them on a cooldown
Posted by Arvind Prabhakar <ar...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4445/#review6231
-----------------------------------------------------------
Thanks for the patch Juhani. I was able to run the tests successfully. I have some minor feedback below for your consideration.
flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java
<https://reviews.apache.org/r/4445/#comment13432>
It will be good to cap this penalty amount to a predefined/configured ceiling value.
flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java
<https://reviews.apache.org/r/4445/#comment13438>
There is one slight issue here though - which is if the channel is empty, the sink being attempted to recover will likely return BACKOFF, implying that the sink is normal and has recovered.
A minor nit: it will be nice if the process invocation on the failed sink was from within the process() that calls the active Sink. That way the logic stays in one place.
- Arvind
On 2012-03-22 08:23:00, Juhani Connolly wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4445/
> -----------------------------------------------------------
>
> (Updated 2012-03-22 08:23:00)
>
>
> Review request for Flume.
>
>
> Summary
> -------
>
> As discussed in the JIRA item, I modified FailoverSink to deal with all exceptions.
> Now a sink that fails will be put onto a failed links queue, from which a recovery will be attempted after a timeout. Each sequential failure the timeout will increase. I am open to other methods of increasing the timeout(maybe add on a ceiling?)
>
>
> This addresses bug FLUME-1030.
> https://issues.apache.org/jira/browse/FLUME-1030
>
>
> Diffs
> -----
>
> flume-ng-core/src/test/java/org/apache/flume/sink/TestFailoverSinkProcessor.java 195c121
> flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java 7eada57
>
> Diff: https://reviews.apache.org/r/4445/diff
>
>
> Testing
> -------
>
> Modified the test for the new functionality, new test passes
>
> No other tests should be affected, but my environment was having some weird problems. I'll look into them tomorrow, just leaving this up so people can have a browse and will confirm tests passing tomorrow
>
>
> Thanks,
>
> Juhani
>
>
Re: Review Request: FLUME-1030,
handle sinks with exceptions by putting them on a cooldown
Posted by Juhani Connolly <ju...@gmail.com>.
> On 2012-03-23 03:48:14, Arvind Prabhakar wrote:
> > flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java, line 193
> > <https://reviews.apache.org/r/4445/diff/2/?file=94789#file94789line193>
> >
> > Log the exception.
Did this, changed EventDeliveryException->Exception
> On 2012-03-23 03:48:14, Arvind Prabhakar wrote:
> > flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java, line 182
> > <https://reviews.apache.org/r/4445/diff/2/?file=94789#file94789line182>
> >
> > It will be good to log this exception so we have a trace of the failures that are happening.
Was just using it to guard against null and number exceptions at the same time. Separated it out and checked for null. Logging the Number exception because it's probably a typo in config
> On 2012-03-23 03:48:14, Arvind Prabhakar wrote:
> > flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java, line 138
> > <https://reviews.apache.org/r/4445/diff/2/?file=94789#file94789line138>
> >
> > This seems like a typo. Perhaps you want to do something like
> >
> > maxPenalty = context.getInteger(CONF_KEY_MAX_PENALTY, DEFAULT_MAX_PENALTY);
> >
> >
Yes. Probably not enough sleep. I checked in the debugger that things were getting correctly set/read now
> On 2012-03-23 03:48:14, Arvind Prabhakar wrote:
> > flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java, lines 94-95
> > <https://reviews.apache.org/r/4445/diff/2/?file=94789#file94789line94>
> >
> > The max penalty calculated during configure of the sink processor should be applied here to enforce the ceiling.
Done
On 2012-03-23 03:48:14, Juhani Connolly wrote:
> > Rest of the changes look good to me.
Wow, what a mess. Must've been tired or something.
The name of the max limit also had a period on its end that I removed.
- Juhani
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4445/#review6265
-----------------------------------------------------------
On 2012-03-23 02:41:03, Juhani Connolly wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4445/
> -----------------------------------------------------------
>
> (Updated 2012-03-23 02:41:03)
>
>
> Review request for Flume.
>
>
> Summary
> -------
>
> As discussed in the JIRA item, I modified FailoverSink to deal with all exceptions.
> Now a sink that fails will be put onto a failed links queue, from which a recovery will be attempted after a timeout. Each sequential failure the timeout will increase. I am open to other methods of increasing the timeout(maybe add on a ceiling?)
>
>
> This addresses bug FLUME-1030.
> https://issues.apache.org/jira/browse/FLUME-1030
>
>
> Diffs
> -----
>
> flume-ng-core/src/test/java/org/apache/flume/sink/TestFailoverSinkProcessor.java 195c121
> flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java 7eada57
>
> Diff: https://reviews.apache.org/r/4445/diff
>
>
> Testing
> -------
>
> Modified the test for the new functionality, new test passes
>
> No other tests should be affected, but my environment was having some weird problems. I'll look into them tomorrow, just leaving this up so people can have a browse and will confirm tests passing tomorrow
>
>
> Thanks,
>
> Juhani
>
>
Re: Review Request: FLUME-1030,
handle sinks with exceptions by putting them on a cooldown
Posted by Arvind Prabhakar <ar...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4445/#review6265
-----------------------------------------------------------
Thanks for making the changes Juhani. Some feedback follows.
flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java
<https://reviews.apache.org/r/4445/#comment13571>
The max penalty calculated during configure of the sink processor should be applied here to enforce the ceiling.
flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java
<https://reviews.apache.org/r/4445/#comment13570>
This seems like a typo. Perhaps you want to do something like
maxPenalty = context.getInteger(CONF_KEY_MAX_PENALTY, DEFAULT_MAX_PENALTY);
flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java
<https://reviews.apache.org/r/4445/#comment13568>
It will be good to log this exception so we have a trace of the failures that are happening.
flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java
<https://reviews.apache.org/r/4445/#comment13569>
Log the exception.
Rest of the changes look good to me.
- Arvind
On 2012-03-23 02:41:03, Juhani Connolly wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4445/
> -----------------------------------------------------------
>
> (Updated 2012-03-23 02:41:03)
>
>
> Review request for Flume.
>
>
> Summary
> -------
>
> As discussed in the JIRA item, I modified FailoverSink to deal with all exceptions.
> Now a sink that fails will be put onto a failed links queue, from which a recovery will be attempted after a timeout. Each sequential failure the timeout will increase. I am open to other methods of increasing the timeout(maybe add on a ceiling?)
>
>
> This addresses bug FLUME-1030.
> https://issues.apache.org/jira/browse/FLUME-1030
>
>
> Diffs
> -----
>
> flume-ng-core/src/test/java/org/apache/flume/sink/TestFailoverSinkProcessor.java 195c121
> flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java 7eada57
>
> Diff: https://reviews.apache.org/r/4445/diff
>
>
> Testing
> -------
>
> Modified the test for the new functionality, new test passes
>
> No other tests should be affected, but my environment was having some weird problems. I'll look into them tomorrow, just leaving this up so people can have a browse and will confirm tests passing tomorrow
>
>
> Thanks,
>
> Juhani
>
>
Re: Review Request: FLUME-1030,
handle sinks with exceptions by putting them on a cooldown
Posted by Arvind Prabhakar <ar...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4445/#review6289
-----------------------------------------------------------
Ship it!
+1
- Arvind
On 2012-03-23 05:28:45, Juhani Connolly wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4445/
> -----------------------------------------------------------
>
> (Updated 2012-03-23 05:28:45)
>
>
> Review request for Flume.
>
>
> Summary
> -------
>
> As discussed in the JIRA item, I modified FailoverSink to deal with all exceptions.
> Now a sink that fails will be put onto a failed links queue, from which a recovery will be attempted after a timeout. Each sequential failure the timeout will increase. I am open to other methods of increasing the timeout(maybe add on a ceiling?)
>
>
> This addresses bug FLUME-1030.
> https://issues.apache.org/jira/browse/FLUME-1030
>
>
> Diffs
> -----
>
> flume-ng-core/src/test/java/org/apache/flume/sink/TestFailoverSinkProcessor.java 195c121
> flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java 7eada57
>
> Diff: https://reviews.apache.org/r/4445/diff
>
>
> Testing
> -------
>
> Modified the test for the new functionality, new test passes
>
> No other tests should be affected, but my environment was having some weird problems. I'll look into them tomorrow, just leaving this up so people can have a browse and will confirm tests passing tomorrow
>
>
> Thanks,
>
> Juhani
>
>
Re: Review Request: FLUME-1030,
handle sinks with exceptions by putting them on a cooldown
Posted by Juhani Connolly <ju...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4445/
-----------------------------------------------------------
(Updated 2012-03-23 05:28:45.159536)
Review request for Flume.
Changes
-------
Fixed suggested changes and also added some javadoc describing functioning and new config setting.
Summary
-------
As discussed in the JIRA item, I modified FailoverSink to deal with all exceptions.
Now a sink that fails will be put onto a failed links queue, from which a recovery will be attempted after a timeout. Each sequential failure the timeout will increase. I am open to other methods of increasing the timeout(maybe add on a ceiling?)
This addresses bug FLUME-1030.
https://issues.apache.org/jira/browse/FLUME-1030
Diffs (updated)
-----
flume-ng-core/src/test/java/org/apache/flume/sink/TestFailoverSinkProcessor.java 195c121
flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java 7eada57
Diff: https://reviews.apache.org/r/4445/diff
Testing
-------
Modified the test for the new functionality, new test passes
No other tests should be affected, but my environment was having some weird problems. I'll look into them tomorrow, just leaving this up so people can have a browse and will confirm tests passing tomorrow
Thanks,
Juhani
Re: Review Request: FLUME-1030,
handle sinks with exceptions by putting them on a cooldown
Posted by Juhani Connolly <ju...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4445/
-----------------------------------------------------------
(Updated 2012-03-23 02:41:03.472482)
Review request for Flume.
Changes
-------
Wasn't applying the penalty limit, fixed now
Summary
-------
As discussed in the JIRA item, I modified FailoverSink to deal with all exceptions.
Now a sink that fails will be put onto a failed links queue, from which a recovery will be attempted after a timeout. Each sequential failure the timeout will increase. I am open to other methods of increasing the timeout(maybe add on a ceiling?)
This addresses bug FLUME-1030.
https://issues.apache.org/jira/browse/FLUME-1030
Diffs (updated)
-----
flume-ng-core/src/test/java/org/apache/flume/sink/TestFailoverSinkProcessor.java 195c121
flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java 7eada57
Diff: https://reviews.apache.org/r/4445/diff
Testing
-------
Modified the test for the new functionality, new test passes
No other tests should be affected, but my environment was having some weird problems. I'll look into them tomorrow, just leaving this up so people can have a browse and will confirm tests passing tomorrow
Thanks,
Juhani
Re: Review Request: FLUME-1030,
handle sinks with exceptions by putting them on a cooldown
Posted by Juhani Connolly <ju...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4445/
-----------------------------------------------------------
(Updated 2012-03-23 01:48:59.198037)
Review request for Flume.
Changes
-------
Updated with the suggested changes.
All tests pass
Summary
-------
As discussed in the JIRA item, I modified FailoverSink to deal with all exceptions.
Now a sink that fails will be put onto a failed links queue, from which a recovery will be attempted after a timeout. Each sequential failure the timeout will increase. I am open to other methods of increasing the timeout(maybe add on a ceiling?)
This addresses bug FLUME-1030.
https://issues.apache.org/jira/browse/FLUME-1030
Diffs (updated)
-----
flume-ng-core/src/test/java/org/apache/flume/sink/TestFailoverSinkProcessor.java 195c121
flume-ng-core/src/main/java/org/apache/flume/sink/FailoverSinkProcessor.java 7eada57
Diff: https://reviews.apache.org/r/4445/diff
Testing
-------
Modified the test for the new functionality, new test passes
No other tests should be affected, but my environment was having some weird problems. I'll look into them tomorrow, just leaving this up so people can have a browse and will confirm tests passing tomorrow
Thanks,
Juhani