You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@synapse.apache.org by "Hubert, Eric" <Er...@foxmobile.com> on 2009/12/05 15:21:34 UTC

Advice on choosing http core NIO version for Synapse 1.2

Hi all,

we are currently using Synapse 1.2 with http-core-nio 4.0 beta2. From time to time and unfortunately not reproducible we encounter IOReactor shutdowns in our production environment. I would like to track down the actual cause of these shutdowns, but the logs reveal no detail. 

I quickly went through the release notes of http core 4.0-beta3, 4.0 and 4.0.1. HTTPCORE-169 looked quite interesting. I remember that some time back we tried out 4.0 but stumbled across an issue related to chunked encoding. Unfortunately my memory has flashed. It could be that it was related to HTTPCORE-175.

Would it be advisable to check out updating to 4.0.1? Is there any chance to get some information about the actual cause in case of IOReactor terminations?


Thanks,
   Eric

Re: Advice on choosing http core NIO version for Synapse 1.2

Posted by Oleg Kalnichevski <ol...@apache.org>.

Hubert, Eric wrote:
> Oleg, thanks for your clarifications!
> 
>> If the reactor shutdown was requested in an orderly fashion there is a
>> pretty good chance it will shut down cleanly and there will be nothing
>> to look at the audit log. The audit log usually comes handy when the
>> reactor shuts down due to an unexpected fatal error.
> Hmmm, how can one know the difference?
> 
> 

An orderly I/O reactor shutdown must be explicitly initiated from 
application code. It is reasonable to assume it should be possible to 
tell whether a shutdown was unexpected or not based on the application 
internal state.

Oleg

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org

Re: Advice on choosing http core NIO version for Synapse 1.2

Posted by "Asankha C. Perera" <as...@apache.org>.

Hi Eric
>
> Asankha, I think found the related issue:
> https://issues.apache.org/jira/browse/SYNAPSE-600
>
>  
>
> According to your comment the fix has only been applied to the 1.3
> branch. Is there any reason to not (yet) apply this change to the trunk?
>
I think as with any release, we should merge back the branch to the
trunk, so this will be available soon. About the ClosedChannelException
being handled with the 4.0.1 release, I'm sorry I missed it.. so we
should be fine just upgrading the version of HttpCore

cheers
asankha

-- 
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com

RE: Advice on choosing http core NIO version for Synapse 1.2

Posted by "Hubert, Eric" <Er...@foxmobile.com>.

Asankha, I think found the related issue: https://issues.apache.org/jira/browse/SYNAPSE-600

According to your comment the fix has only been applied to the 1.3 branch. Is there any reason to not (yet) apply this change to the trunk?
________________________________
From: Hubert, Eric [mailto:Eric.Hubert@foxmobile.com]
Sent: Monday, December 07, 2009 7:44 AM
To: dev@synapse.apache.org
Subject: RE: Advice on choosing http core NIO version for Synapse 1.2

Hi Oleg, Eric

Me neither, but from what I've seen, in HttpCoreNIOSender Synapse logs the thrown IOReactorException including the cause information as early as it has a chance to do so:

Eric - you are looking at a previous version of the code, when we enabled auto-restart on unexpected shutdown, we improved this logic. However the real cause is the ClosedChannelException issue. We can neatly log fatal causes when such a restart is performed, so that we will know in future, what caused such a filure
Asankha, I actually was also a bit confused as I expected to find the code which attempts the restart. But if I’m not wrong I was really looking at trunk. Is it possible, that the change has only been applied to the Synapse 1.3 branch and not yet forward ported to the trunk?

Lets work on getting this fix into HttpComponents, probably as you switch to the new HttpCore version, since this is a quite common scenario you encounter in your deployment
Which fix are you referring to regarding HttpCompoents? Do you mean the one regarding the ClosedChannelException? This one has already been fixed for beta-3 and will be fine in 4.0.1. What other fix do you see which can be applied to HttpComponents directly? Sorry, maybe I’m a bit to tired this morning…

Thanks,
   Eric

RE: Advice on choosing http core NIO version for Synapse 1.2

Posted by "Hubert, Eric" <Er...@foxmobile.com>.

Hi Oleg, Eric

Me neither, but from what I've seen, in HttpCoreNIOSender Synapse logs the thrown IOReactorException including the cause information as early as it has a chance to do so:


Eric - you are looking at a previous version of the code, when we enabled auto-restart on unexpected shutdown, we improved this logic. However the real cause is the ClosedChannelException issue. We can neatly log fatal causes when such a restart is performed, so that we will know in future, what caused such a filure

Asankha, I actually was also a bit confused as I expected to find the code which attempts the restart. But if I’m not wrong I was really looking at trunk. Is it possible, that the change has only been applied to the Synapse 1.3 branch and not yet forward ported to the trunk?


Lets work on getting this fix into HttpComponents, probably as you switch to the new HttpCore version, since this is a quite common scenario you encounter in your deployment

Which fix are you referring to regarding HttpCompoents? Do you mean the one regarding the ClosedChannelException? This one has already been fixed for beta-3 and will be fine in 4.0.1. What other fix do you see which can be applied to HttpComponents directly? Sorry, maybe I’m a bit to tired this morning…


Thanks,
   Eric

Re: Advice on choosing http core NIO version for Synapse 1.2

Posted by "Asankha C. Perera" <as...@apache.org>.

Hi Oleg, Eric
> Me neither, but from what I've seen, in HttpCoreNIOSender Synapse logs the thrown IOReactorException including the cause information as early as it has a chance to do so:
>   
Eric - you are looking at a previous version of the code, when we
enabled auto-restart on unexpected shutdown, we improved this logic.
However the real cause is the ClosedChannelException issue. We can
neatly log fatal causes when such a restart is performed, so that we
will know in future, what caused such a filure
>>> Was it correct to terminate the Reactor?
>>>       
>> No, it was not. ClosedChannelException should not be considered fatal.
>>     
> This was also my understanding.
>
>   
>> It is a shame the exception is unchecked, though.
>>     
> Agreed.
>   

Lets work on getting this fix into HttpComponents, probably as you
switch to the new HttpCore version, since this is a quite common
scenario you encounter in your deployment

cheers
asankha

-- 
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com

RE: Advice on choosing http core NIO version for Synapse 1.2

Posted by "Hubert, Eric" <Er...@foxmobile.com>.

Oleg, thanks for your clarifications!

> If the reactor shutdown was requested in an orderly fashion there is a
> pretty good chance it will shut down cleanly and there will be nothing
> to look at the audit log. The audit log usually comes handy when the
> reactor shuts down due to an unexpected fatal error.
Hmmm, how can one know the difference?


> HttpCore does not do logging by itself. 
Ah, I think this perfectly explains it.

> I do not know Synapse that well to answer this question.
Me neither, but from what I've seen, in HttpCoreNIOSender Synapse logs the thrown IOReactorException including the cause information as early as it has a chance to do so:

Thread t = new Thread(new Runnable() {
    public void run() {
        try {
            ioReactor.execute(ioEventDispatch);
        } catch (InterruptedIOException ex) {
            log.fatal("Reactor Interrupted");
        } catch (IOException e) {
            log.fatal("Encountered an I/O error: " + e.getMessage(), e);
        }
     }
 }, "HttpCoreNIOSender");

So before this pops up, AbstractMultiworkerIOReactor performs the IOReactor shutdown in it's finally block, which obviously may take a while.
This should explain the behaviour I observed.


> > Was in this case the java.nio.channels.ClosedChannelException the cause
> of the IOReactor shutdown?
> Yes, it was.
Thanks for confirming this.


> > Was it correct to terminate the Reactor?
> No, it was not. ClosedChannelException should not be considered fatal.
This was also my understanding.

> It is a shame the exception is unchecked, though.
Agreed.


> > If some code within the core nio library decides to throw an
> IOReactorException could it make sense to log the cause immediately or are
> there cases where the code at higher levels may still consider it
> otherwise.
> 
> IOReactorException is considered fatal and should be propagated to the
> I/O reactor thread unless discarded by the IOReactorExceptionHandler
Ok, I now much better understand the expected flow. I was simply not aware, that http core nio is not logging at all and delegates this to the client using the library. This has some consequences to the order in which one can expect error messages. Good to know.


> > Anyway, if I'm not wrong and all of the above makes at least partially
> sense, than it looks to me that the fix in
> http://issues.apache.org/jira/browse/HTTPCORE-169
> > which went in beta-3 should have prevented the IOReactor restart.
> >
> 
> Correct. This is basically the same bug.
Thanks, Oleg you helped me a lot with your explanations. This is also just another proof that it is a good idea for us to update to 4.0.1.

Thanks,
   Eric

Re: Advice on choosing http core NIO version for Synapse 1.2

Posted by Oleg Kalnichevski <ol...@apache.org>.

Hubert, Eric wrote:
>> I cannot think of any good reason to continue using HttpCore older than
>> 4.0.1. Upgrade would be quite recommended.
> Oleg, thanks a lot for your quick response and suggestion. I think we will update our integration system to 4.0.1 first, and see how it runs there.
> 
>  
>> The process of an I/O reactor shutdown is quite complex and therefore
>> lots of things can go sideways if a particular I/O worker terminates
>> abnormally due to a runtime or a fatal I/O error. To help deal with
>> postmortem analysis I/O reactors maintain so called exception log that
>> contains all exceptions thrown in the process of reactor shutdown
>> including the one that triggered it.
>>
>> http://hc.apache.org/httpcomponents-core/tutorial/html/nio.html#d0e1287
>> Whenever an I/O reactor terminates it is advisable to examine the audit
>> log and if it contains any entries print them out to the application log.
> 
> Maybe I have to have a closer look at the code level for both Synapse and http Core to get a better understanding... So please excuse, if my following thoughts are incorrect and rather correct them.
> 
> Don't we generally have to differentiate between 
> 1) situations the code using the http nio library initiates/requests a IOReactor shutdown and 
> 2) those, where the http nio library does this on its own based on an IOReactorException? 
> 
> For the first case I now understand, that one should always check the audit log in order to also find out if there have been exceptions during the shutdown process. AFAIK Synapse currently does not follow this suggestion.
>

If the reactor shutdown was requested in an orderly fashion there is a 
pretty good chance it will shut down cleanly and there will be nothing 
to look at the audit log. The audit log usually comes handy when the 
reactor shuts down due to an unexpected fatal error.


> For the second case I understood that the client code has a chance to intercept the exception handling by providing an IOReactorExceptionHandler which can attempt to decide which exceptions shall be considered recoverable and which not. This is the place where the client can/should also log the cause of both IOExceptions and RuntimeExceptions. For Synapse this is the case. Anyway it looks like there are situations where IOReactorExceptions are thrown at other places without a chance for the client to "intercept". In this cases the client code finds out that the IOReactor has been shutdown or is in the process of being shutdown (IllegalStateException while calling connect()) and can just attempt to restart it.
> 

This is one possibility. Alternatively the reactor thread can intercept 
the fatal exception that caused the shutdown, optionally examine the 
audit log for additional information, and based on the result of the 
analysis choose to restart the reactor. HttpCoreNIOListener and 
HttpCoreNIOSender classes would be the ones to look at

http://svn.apache.org/repos/asf/synapse/trunk/java/modules/transports/core/nhttp/src/main/java/org/apache/synapse/transport/nhttp/HttpCoreNIOListener.java
http://svn.apache.org/repos/asf/synapse/trunk/java/modules/transports/core/nhttp/src/main/java/org/apache/synapse/transport/nhttp/HttpCoreNIOSender.java

> Is it possible, that the actual cause of an IOReactor shutdown will be logged a couple of time after it actually occurred?
>

HttpCore does not do logging by itself. I do not know Synapse that well 
to answer this question.


> Or to be more concrete, first error output:
> ERROR 2009-12-02 10:50:34,802 [HttpClientWorker-97][][] org.apache.synapse.core.axis2.Axis2Sender 'Unexpected error during sending message out'
> java.lang.IllegalStateException: I/O reactor has been shut down
>         at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.connect(DefaultConnectingIOReactor.java:176)
>         at org.apache.synapse.transport.nhttp.HttpCoreNIOSender.sendAsyncRequest(HttpCoreNIOSender.java:392)
> 
> [...  a few 100 log lines later ...]
> 
> FATAL 2009-12-02 10:50:35,031 [HttpCoreNIOSender][][] org.apache.synapse.transport.nhttp.HttpCoreNIOSender 'Encountered an I/O error: Failure registering channel with the selector'
> org.apache.http.nio.reactor.IOReactorException: Failure registering channel with the selector
>         at org.apache.http.impl.nio.reactor.AbstractIOReactor.processNewChannels(AbstractIOReactor.java:220)
>         at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:153)
>         at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:70)
>         at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:324)
>         at java.lang.Thread.run(Thread.java:595)
> Caused by: java.nio.channels.ClosedChannelException
>         at java.nio.channels.spi.AbstractSelectableChannel.configureBlocking(AbstractSelectableChannel.java:252)
>         at org.apache.http.impl.nio.reactor.AbstractIOReactor.processNewChannels(AbstractIOReactor.java:217)
>         ... 4 more
> 
> Was in this case the java.nio.channels.ClosedChannelException the cause of the IOReactor shutdown?

Yes, it was.

> Was it correct to terminate the Reactor? 

No, it was not. ClosedChannelException should not be considered fatal. 
It is a shame the exception is unchecked, though.

> If some code within the core nio library decides to throw an IOReactorException could it make sense to log the cause immediately or are there cases where the code at higher levels may still consider it otherwise.

IOReactorException is considered fatal and should be propagated to the 
I/O reactor thread unless discarded by the IOReactorExceptionHandler

> 
> Anyway, if I'm not wrong and all of the above makes at least partially sense, than it looks to me that the fix in http://issues.apache.org/jira/browse/HTTPCORE-169
> which went in beta-3 should have prevented the IOReactor restart.
> 

Correct. This is basically the same bug.

Oleg

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org

RE: Advice on choosing http core NIO version for Synapse 1.2

Posted by "Hubert, Eric" <Er...@foxmobile.com>.

> I cannot think of any good reason to continue using HttpCore older than
> 4.0.1. Upgrade would be quite recommended.
Oleg, thanks a lot for your quick response and suggestion. I think we will update our integration system to 4.0.1 first, and see how it runs there.

 
> The process of an I/O reactor shutdown is quite complex and therefore
> lots of things can go sideways if a particular I/O worker terminates
> abnormally due to a runtime or a fatal I/O error. To help deal with
> postmortem analysis I/O reactors maintain so called exception log that
> contains all exceptions thrown in the process of reactor shutdown
> including the one that triggered it.
> 
> http://hc.apache.org/httpcomponents-core/tutorial/html/nio.html#d0e1287
> Whenever an I/O reactor terminates it is advisable to examine the audit
> log and if it contains any entries print them out to the application log.

Maybe I have to have a closer look at the code level for both Synapse and http Core to get a better understanding... So please excuse, if my following thoughts are incorrect and rather correct them.

Don't we generally have to differentiate between 
1) situations the code using the http nio library initiates/requests a IOReactor shutdown and 
2) those, where the http nio library does this on its own based on an IOReactorException? 

For the first case I now understand, that one should always check the audit log in order to also find out if there have been exceptions during the shutdown process. AFAIK Synapse currently does not follow this suggestion.

For the second case I understood that the client code has a chance to intercept the exception handling by providing an IOReactorExceptionHandler which can attempt to decide which exceptions shall be considered recoverable and which not. This is the place where the client can/should also log the cause of both IOExceptions and RuntimeExceptions. For Synapse this is the case. Anyway it looks like there are situations where IOReactorExceptions are thrown at other places without a chance for the client to "intercept". In this cases the client code finds out that the IOReactor has been shutdown or is in the process of being shutdown (IllegalStateException while calling connect()) and can just attempt to restart it.

Is it possible, that the actual cause of an IOReactor shutdown will be logged a couple of time after it actually occurred?

Or to be more concrete, first error output:
ERROR 2009-12-02 10:50:34,802 [HttpClientWorker-97][][] org.apache.synapse.core.axis2.Axis2Sender 'Unexpected error during sending message out'
java.lang.IllegalStateException: I/O reactor has been shut down
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.connect(DefaultConnectingIOReactor.java:176)
        at org.apache.synapse.transport.nhttp.HttpCoreNIOSender.sendAsyncRequest(HttpCoreNIOSender.java:392)

[...  a few 100 log lines later ...]

FATAL 2009-12-02 10:50:35,031 [HttpCoreNIOSender][][] org.apache.synapse.transport.nhttp.HttpCoreNIOSender 'Encountered an I/O error: Failure registering channel with the selector'
org.apache.http.nio.reactor.IOReactorException: Failure registering channel with the selector
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.processNewChannels(AbstractIOReactor.java:220)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:153)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:70)
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:324)
        at java.lang.Thread.run(Thread.java:595)
Caused by: java.nio.channels.ClosedChannelException
        at java.nio.channels.spi.AbstractSelectableChannel.configureBlocking(AbstractSelectableChannel.java:252)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.processNewChannels(AbstractIOReactor.java:217)
        ... 4 more

Was in this case the java.nio.channels.ClosedChannelException the cause of the IOReactor shutdown? Was it correct to terminate the Reactor? If some code within the core nio library decides to throw an IOReactorException could it make sense to log the cause immediately or are there cases where the code at higher levels may still consider it otherwise.

Anyway, if I'm not wrong and all of the above makes at least partially sense, than it looks to me that the fix in http://issues.apache.org/jira/browse/HTTPCORE-169
which went in beta-3 should have prevented the IOReactor restart.

Oleg, please correct me if my understanding is wrong. I'm not very familiar with the code.

Regards,
   Eric

Re: Advice on choosing http core NIO version for Synapse 1.2

Posted by Ruwan Linton <ru...@gmail.com>.

Oleg,

I think this will be very useful in general for synapse. Is this logging
behaviour on the synapse trunk??

Thanks,
Ruwan

On Sat, Dec 5, 2009 at 8:28 PM, Oleg Kalnichevski <ol...@apache.org> wrote:

> Hubert, Eric wrote:
>
>> Hi all,
>>
>> we are currently using Synapse 1.2 with http-core-nio 4.0 beta2. From time
>> to time and unfortunately not reproducible we encounter IOReactor shutdowns
>> in our production environment. I would like to track down the actual cause
>> of these shutdowns, but the logs reveal no detail.
>> I quickly went through the release notes of http core 4.0-beta3, 4.0 and
>> 4.0.1. HTTPCORE-169 looked quite interesting. I remember that some time back
>> we tried out 4.0 but stumbled across an issue related to chunked encoding.
>> Unfortunately my memory has flashed. It could be that it was related to
>> HTTPCORE-175.
>>
>> Would it be advisable to check out updating to 4.0.1? Is there any chance
>> to get some information about the actual cause in case of IOReactor
>> terminations?
>>
>>
>> Thanks,
>>   Eric
>>
>
>
> Hi Eric
>
> I cannot think of any good reason to continue using HttpCore older than
> 4.0.1. Upgrade would be quite recommended.
>
> The process of an I/O reactor shutdown is quite complex and therefore lots
> of things can go sideways if a particular I/O worker terminates abnormally
> due to a runtime or a fatal I/O error. To help deal with postmortem analysis
> I/O reactors maintain so called exception log that contains all exceptions
> thrown in the process of reactor shutdown including the one that triggered
> it.
>
> http://hc.apache.org/httpcomponents-core/tutorial/html/nio.html#d0e1287
>
> Whenever an I/O reactor terminates it is advisable to examine the audit log
> and if it contains any entries print them out to the application log.
>
> Hope this helps
>
> Oleg
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>


-- 
Ruwan Linton
Technical Lead & Product Manager; WSO2 ESB; http://wso2.org/esb
WSO2 Inc.; http://wso2.org
email: ruwan@wso2.com; cell: +94 77 341 3097
blog: http://ruwansblog.blogspot.com

Re: Advice on choosing http core NIO version for Synapse 1.2

Posted by Oleg Kalnichevski <ol...@apache.org>.

Hubert, Eric wrote:
> Hi all,
> 
> we are currently using Synapse 1.2 with http-core-nio 4.0 beta2. From time to time and unfortunately not reproducible we encounter IOReactor shutdowns in our production environment. I would like to track down the actual cause of these shutdowns, but the logs reveal no detail. 
> 
> I quickly went through the release notes of http core 4.0-beta3, 4.0 and 4.0.1. HTTPCORE-169 looked quite interesting. I remember that some time back we tried out 4.0 but stumbled across an issue related to chunked encoding. Unfortunately my memory has flashed. It could be that it was related to HTTPCORE-175.
> 
> Would it be advisable to check out updating to 4.0.1? Is there any chance to get some information about the actual cause in case of IOReactor terminations?
> 
> 
> Thanks,
>    Eric

Hi Eric

I cannot think of any good reason to continue using HttpCore older than 
4.0.1. Upgrade would be quite recommended.

The process of an I/O reactor shutdown is quite complex and therefore 
lots of things can go sideways if a particular I/O worker terminates 
abnormally due to a runtime or a fatal I/O error. To help deal with 
postmortem analysis I/O reactors maintain so called exception log that 
contains all exceptions thrown in the process of reactor shutdown 
including the one that triggered it.

http://hc.apache.org/httpcomponents-core/tutorial/html/nio.html#d0e1287

Whenever an I/O reactor terminates it is advisable to examine the audit 
log and if it contains any entries print them out to the application log.

Hope this helps

Oleg

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org