Posted to proton@qpid.apache.org by Adrian Preston <PR...@uk.ibm.com> on 2015/04/23 18:17:00 UTC

Error handling in the proton-c reactor (and how it might relate to a Java port)

Hello all,

While porting the proton-c reactor to Java, I've found a few error paths that I wasn't sure how best to handle.

I have some ideas (see below), but if this stuff is already written down somewhere - feel free to suitably admonish me (and then point me towards it...)

1) When an error occurs while the reactor is servicing a connection: the connection is closed with a transport error. This is already implemented by various functions in reactor/connection.c (e.g. pni_handle_bound, to pick one at random), so I expect Java following suit shouldn't be too contentious.

2) When an error occurs while the reactor is accepting a connection: a PN_SELECTABLE_ERROR event is delivered to the acceptor's collector. This might necessitate a new pn_acceptor_attachments function to associate a handler with an acceptor (casting to selectable strikes me as something that might break in the future...). Aside: should it be possible to associate a pn_error (Java Throwable?) with an event, so that it is possible to report the underlying cause for a PN_SELECTABLE_ERROR?

3) In the Java reactor it is possible for an unchecked (derived from RuntimeException) exception to be thrown from a handler. Delivering a PN_SELECTABLE_ERROR to the selectable seems like the wrong thing to do (because the handler that threw the exception might not be associated with a selectable, or the exception could be thrown while handling PN_SELECTABLE_ERROR). Logging the exception then swallowing it seems likely to result in situations where the reactor appears to have hung. So the best I've come up with is that the Java equivalent of pn_reactor_process throws an exception - but then I'm not clear what state the reactor should be left in? Permanently failing, by throwing a "ReactorBorked" exception from any future pn_reactor_process invocation? Also, if this happens should the reactor be responsible for reclaiming the resources used by its children (e.g. closing their sockets)?

Thanks,
- Adrian

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

Re: Error handling in the proton-c reactor (and how it might relate to a Java port)

Posted by Adrian Preston <PR...@uk.ibm.com>.

Thanks Rafael,

This is really helpful.  I'll have a look through the Python reactor code
(kicking myself for not thinking to do this earlier...) and try and get the
Java port to handle errors in a similar way.

Regards
- Adrian


On Fri, Apr 24, 2015 at 11:06 PM, Rafael Schloming <rh...@alum.mit.edu> wrote:

> Hi Adrian,
>
> See inline for answers...
>
> On Thu, Apr 23, 2015 at 12:17 PM, Adrian Preston <PR...@uk.ibm.com>
> wrote:
>
> > Hello all,
> >
> > While porting the proton-c reactor to Java, I've found a few error paths
> > that I wasn't sure how best to handle.
> >
> > I have some ideas (see below), but if this stuff is already written down
> > somewhere - feel free to suitably admonish me (and then point me towards
> > it...)
> >
> > 1) When an error occurs while the reactor is servicing a connection: the
> > connection is closed with a transport error.  This is already implemented
> > by various functions in reactor/connection.c (e.g. pni_handle_bound, to
> > pick one at random), so I expect Java following suit shouldn't be too
> > contentious.
> >
>
> Yes
>
>
> > 2) When an error occurs while the reactor is accepting a connection: a
> > PN_SELECTABLE_ERROR event is delivered to the acceptor's collector.  This
> > might necessitate a new pn_acceptor_attachments function to associate a
> > handler with an acceptor (casting to selectable strikes me as something
> > that might break in the future...).  Aside: should it be possible to
> > associate a pn_error (Java Throwable?) with an event, so that it is
> > possible to report the underlying cause for a PN_SELECTABLE_ERROR?
> >
>
> A pn_acceptor_attachments function makes sense to me.
>
> Regarding your other question. In general I've been trying to stick to
> having each event reference only a single object, and also reference state
> in the object model rather than carry state itself, so I might consider
> adding an accessor to pn_selectable_t to store/extract error information
> instead of storing it on the event.
>
> > 3) In the Java reactor it is possible for an unchecked (derived from
> > RuntimeException) exception to be thrown from a handler.  Delivering a
> > PN_SELECTABLE_ERROR to the selectable seems like the wrong thing to do
> > (because the handler that threw the exception might not be associated
> > with a selectable, or the exception could be thrown while handling
> > PN_SELECTABLE_ERROR).  Logging the exception then swallowing it seems
> > likely to result in situations where the reactor appears to have hung.
> > So the best I've come up with is that the Java equivalent of
> > pn_reactor_process throws an exception - but then I'm not clear what
> > state the reactor should be left in?  Permanently failing, by throwing a
> > "ReactorBorked" exception from any future pn_reactor_process invocation?
> > Also, if this happens should the reactor be responsible for reclaiming
> > the resources used by its children (e.g. closing their sockets)?
> >
>
> The python wrapper of the reactor has a similar situation since python code
> can also throw runtime exceptions. From my experience coding against the
> API, you definitely want to know sooner rather than later exactly what has
> gone wrong. It can be easy to miss errors that scroll by in a log, so I
> would definitely not attempt to continue executing automatically. That said
> I would try not to leave the reactor in a permanently borked state either
> since you might want the option to fire events related to shutdown after an
> error.
>
> What I've done in Python is roughly the following. I catch and save any
> exceptions that occur during dispatch of the current event to its handlers.
> When that event has been dispatched to all handlers, I throw an exception
> (it's anonymous currently, but it should probably be some sort of
> DispatchException) from Reactor.process() that references any exceptions
> that occurred during dispatch of that event. This by default results in the
> reactor failing fast when an exception occurs, but also leaves things in a
> state where the user can easily log the exception and call process again if
> they wish to continue.
>
> Regarding reclaiming resources, I don't attempt to close sockets or
> anything like that since for my use cases when the reactor fails the whole
> process exits. In C this will happen when the reactor is freed, but
> obviously in python and/or java you would be depending on GC to make that
> happen and it might not be soon enough, so it may make sense to add a
> method that would explicitly do that sort of cleanup.
>
> --Rafael
> 

Adrian Preston, IBM WebSphere MQ Development - Hursley - MQ Light


Re: Error handling in the proton-c reactor (and how it might relate to a Java port)

Posted by Rafael Schloming <rh...@alum.mit.edu>.

Hi Adrian,

See inline for answers...

On Thu, Apr 23, 2015 at 12:17 PM, Adrian Preston <PR...@uk.ibm.com>
wrote:

> Hello all,
>
> While porting the proton-c reactor to Java, I've found a few error paths
> that I wasn't sure how best to handle.
>
> I have some ideas (see below), but if this stuff is already written down
> somewhere - feel free to suitably admonish me (and then point me towards
> it...)
>
> 1) When an error occurs while the reactor is servicing a connection: the
> connection is closed with a transport error.  This is already implemented
> by various functions in reactor/connection.c (e.g. pni_handle_bound, to
> pick one at random), so I expect Java following suit shouldn't be too
> contentious.
>

Yes

> 2) When an error occurs while the reactor is accepting a connection: a
> PN_SELECTABLE_ERROR event is delivered to the acceptor's collector.  This
> might necessitate a new pn_acceptor_attachments function to associate a
> handler with an acceptor (casting to selectable strikes me as something
> that might break in the future...).  Aside: should it be possible to
> associate a pn_error (Java Throwable?) with an event, so that it is
> possible to report the underlying cause for a PN_SELECTABLE_ERROR?
>

A pn_acceptor_attachments function makes sense to me.

Regarding your other question. In general I've been trying to stick to
having each event reference only a single object, and also reference state
in the object model rather than carry state itself, so I might consider
adding an accessor to pn_selectable_t to store/extract error information
instead of storing it on the event.
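
A minimal Python sketch of that idea, for illustration only: the event
references a single object, and the error lives on that object. The class
and method names here (Selectable, set_error, get_error, Event) are
hypothetical stand-ins, not the proton API.

```python
class Selectable:
    """Hypothetical stand-in for pn_selectable_t; not the proton API."""

    def __init__(self, name):
        self.name = name
        self._error = None

    def set_error(self, error):
        # Store the underlying cause on the object itself.
        self._error = error

    def get_error(self):
        return self._error


class Event:
    """Carries only a type and a reference to one context object."""

    def __init__(self, event_type, context):
        self.type = event_type
        self.context = context


def on_selectable_error(event):
    # The handler reads the cause from the object model, not from the event.
    selectable = event.context
    return "%s failed: %s" % (selectable.name, selectable.get_error())


acceptor = Selectable("acceptor")
acceptor.set_error(OSError("too many open files"))
message = on_selectable_error(Event("PN_SELECTABLE_ERROR", acceptor))
print(message)
```

With this shape the event stays stateless, and anything that can reach the
selectable can also reach the reason it failed.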

> 3) In the Java reactor it is possible for an unchecked (derived from
> RuntimeException) exception to be thrown from a handler.  Delivering a
> PN_SELECTABLE_ERROR to the selectable seems like the wrong thing to do
> (because the handler that threw the exception might not be associated with
> a selectable, or the exception could be thrown while handling
> PN_SELECTABLE_ERROR).  Logging the exception then swallowing it seems
> likely to result in situations where the reactor appears to have hung.  So
> the best I've come up with is that the Java equivalent of
> pn_reactor_process throws an exception - but then I'm not clear what state
> the reactor should be left in?  Permanently failing, by throwing a
> "ReactorBorked" exception from any future pn_reactor_process invocation?
> Also, if this happens should the reactor be responsible for reclaiming the
> resources used by its children (e.g. closing their sockets)?
>

The Python wrapper of the reactor has a similar situation since Python code
can also throw runtime exceptions. From my experience coding against the
API, you definitely want to know sooner rather than later exactly what has
gone wrong. It can be easy to miss errors that scroll by in a log, so I
would definitely not attempt to continue executing automatically. That said
I would try not to leave the reactor in a permanently borked state either
since you might want the option to fire events related to shutdown after an
error.

What I've done in Python is roughly the following. I catch and save any
exceptions that occur during dispatch of the current event to its handlers.
When that event has been dispatched to all handlers, I throw an exception
(it's anonymous currently, but it should probably be some sort of
DispatchException) from Reactor.process() that references any exceptions
that occurred during dispatch of that event. This by default results in the
reactor failing fast when an exception occurs, but also leaves things in a
state where the user can easily log the exception and call process again if
they wish to continue.
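
The behaviour described above can be sketched roughly as follows. This is
not the actual Python wrapper code: SketchReactor, DispatchError and the
plain list used as an event queue are assumptions made for illustration.

```python
class DispatchError(Exception):
    """Hypothetical exception that references every handler failure."""

    def __init__(self, event, causes):
        super().__init__(
            "%d handler(s) failed while dispatching %r" % (len(causes), event)
        )
        self.event = event
        self.causes = causes


class SketchReactor:
    """Toy reactor: a list of handlers and a list used as an event queue."""

    def __init__(self, handlers):
        self.handlers = list(handlers)
        self.events = []

    def process(self):
        """Dispatch one queued event. Returns False when the queue is empty."""
        if not self.events:
            return False
        event = self.events.pop(0)
        causes = []
        for handler in self.handlers:
            try:
                handler(event)
            except Exception as exc:
                # Save the failure and keep dispatching to the other handlers.
                causes.append(exc)
        if causes:
            # Fail fast, but only after every handler has seen the event.
            raise DispatchError(event, causes)
        return True


seen = []


def bad_handler(event):
    raise ValueError("boom")


reactor = SketchReactor([bad_handler, seen.append])
reactor.events.extend(["first", "second"])

failure = None
try:
    reactor.process()
except DispatchError as err:
    failure = err  # the caller can log this and decide whether to continue

# The reactor is still usable: the failed event was consumed, so calling
# process() again moves on to the next one (which fails the same way here).
try:
    reactor.process()
except DispatchError:
    pass
```

Because the failed event has already been removed from the queue before the
exception is raised, calling process() again resumes with the next event
rather than replaying the one that failed.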

Regarding reclaiming resources, I don't attempt to close sockets or
anything like that since for my use cases when the reactor fails the whole
process exits. In C this will happen when the reactor is freed, but
obviously in Python and/or Java you would be depending on GC to make that
happen and it might not be soon enough, so it may make sense to add a
method that would explicitly do that sort of cleanup.
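
One possible shape for such a cleanup method, sketched in Python. Nothing
here exists in proton: ClosableReactor, add_child and close are
hypothetical names, and socket.socketpair() merely stands in for the
sockets owned by the reactor's children.

```python
import socket


class ClosableReactor:
    """Toy reactor that tracks child sockets so they can be closed on demand."""

    def __init__(self):
        self._children = []

    def add_child(self, sock):
        self._children.append(sock)
        return sock

    def close(self):
        """Release child resources now; safe to call more than once."""
        while self._children:
            self._children.pop().close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow the exception that caused the exit


left, right = socket.socketpair()
try:
    with ClosableReactor() as reactor:
        reactor.add_child(left)
        reactor.add_child(right)
        raise RuntimeError("handler failed")
except RuntimeError:
    pass  # both sockets are already closed here, without waiting for GC
```

In Java the analogous shape would presumably be an AutoCloseable used with
try-with-resources, though that is a design guess rather than anything
decided in this thread.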

--Rafael