
Posted to users@qpid.apache.org by Fraser Adams <fr...@blueyonder.co.uk> on 2014/09/03 21:05:28 UTC

proton Messenger error handling/recovery

Hello,
I've probably missed something, but I don't know how to reliably detect 
failures and reconnect.

So if I send to an address with a freshly stood up Messenger instance 
and the address can't be found, things aren't too bad: I wind up with 
an ECONNREFUSED that I could do something with. However, if I've been 
sending messages to a valid address and then kill off the consumer, I see:

[0x513380]:ERROR amqp:connection:framing-error connection aborted
[0x513380]:ERROR[-2] connection aborted

CONNECTION ERROR connection aborted (remote)

The thing is that all of these are *internally* generated messages sent 
to the console via fprintf, so my *application* doesn't really know 
about them (though I could be crafty and interpose my own cheeky fprintf 
to intercept them). That doesn't quite sound like the desired behaviour 
for a robust system?
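
For anyone wanting to try the interception trick from a scripting language, here is a minimal sketch in Python. It is not something Proton provides. It temporarily points file descriptor 2 at a temporary file, so that anything a C library writes to stderr can be read back afterwards. The helper name capture_stderr_fd is made up for illustration, and the os.write call in the usage note only simulates the library's output.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def capture_stderr_fd():
    """Redirect fd 2 to a temp file; the yielded list is filled on exit."""
    captured = []
    saved_fd = os.dup(2)  # remember where stderr really points
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), 2)  # C-level writes to stderr now land in tmp
        try:
            yield captured
        finally:
            os.dup2(saved_fd, 2)  # restore the real stderr
            os.close(saved_fd)
            tmp.seek(0)
            captured.append(tmp.read().decode("utf-8", "replace"))
```

Usage would be along the lines of "with capture_stderr_fd() as out: os.write(2, b'...')", after which out[0] holds the text. It is still a hack: the application only sees text, not an error condition.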


Similarly, should I actually trap an error, what's the correct way to 
continue? As it happens, my app currently carries on silently doing 
nothing useful, and continues to do so even when the peer restarts (so 
there is no magic internal reconnection logic as far as I can see).

Do I have to do a
messenger.stop()
messenger.start()

cycle to get things going again? I'm guessing so, but I'd like to know 
what the "correct"/expected way is to create Messenger code that is 
robust against remote failures. As far as I can see there are no 
examples of that sort of thing.

Cheers,
Frase

RE: proton Messenger error handling/recovery REQUEST FEEDBACK!

Posted by Ray Keating <ra...@pontimax.com>.

I'm *really* interested, Frase.

-Ray Keating, Pontimax Technologies LLC

-----Original Message-----
From: Fraser Adams [mailto:fraser.adams@blueyonder.co.uk] 
Sent: Monday, September 8, 2014 2:07 PM
To: users@qpid.apache.org
Subject: Re: proton Messenger error handling/recovery REQUEST FEEDBACK!

Messenger gurus seem to be keeping their heads down a bit.

Is it *really* just Alan and I who are interested to understand the error handling/reconnection behaviour of Messenger?

Is anybody using it in "industrial strength" applications or is it just being used in quick and dirty demos? Without error handling and reconnection mechanisms I'm struggling to see how it can be used for the former.

I can likely hack things and Alan also mentioned that he "cheats", but I'd really like to know from people who really understand messenger how to do it *properly*.

Frase


On 05/09/14 14:17, Alan Conway wrote:
> On Thu, 2014-09-04 at 18:28 +0100, Fraser Adams wrote:
>> On 03/09/14 23:29, Alan Conway wrote:
>>> On Wed, 2014-09-03 at 20:05 +0100, Fraser Adams wrote:
>>>> Hello,
>>>> I've probably missed something, but I don't know how to reliably 
>>>> detect failures and reconnect.
>>>>
>>>> So if I send to an address with a freshly stood up Messenger 
>>>> instance and the address can't be found, things aren't too bad: I 
>>>> wind up with an ECONNREFUSED that I could do something with. 
>>>> However, if I've been sending messages to a valid address and then 
>>>> kill off the consumer, I see:
>>>>
>>>> [0x513380]:ERROR amqp:connection:framing-error connection aborted 
>>>> [0x513380]:ERROR[-2] connection aborted
>>>>
>>>> CONNECTION ERROR connection aborted (remote)
>>>>
>>>> The thing is that all of these are *internally* generated messages 
>>>> sent to the console via fprintf, so my *application* doesn't really 
>>>> know about them (though I could be crafty and interpose my own 
>>>> cheeky fprintf to intercept them). That doesn't quite sound like 
>>>> the desired behaviour for a robust system?
>>>>
>>>>
>>>> Similarly, should I actually trap an error, what's the correct way 
>>>> to continue? As it happens, my app currently carries on silently 
>>>> doing nothing useful, and continues to do so even when the peer 
>>>> restarts (so there is no magic internal reconnection logic as far 
>>>> as I can see).
>>>>
>>>> Do I have to do a
>>>> messenger.stop()
>>>> messenger.start()
>>>>
>>>> cycle to get things going again? I'm guessing so, but I'd like to 
>>>> know what the "correct"/expected way is to create Messenger code 
>>>> that is robust against remote failures. As far as I can see there 
>>>> are no examples of that sort of thing.
>>> I've come up against similar problems, I think it's an area that 
>>> needs some work in Proton. Is anybody already working on/thinking 
>>> about this area?
>>>
>>> Cheers,
>>> Alan.
>>>
>> I'd definitely like to know how others deal with this sort of thing.
> I cheat. I've been using proton in dispatch system tests, I come up 
> against these issues when I start up some proton/dispatch network and 
> try to use it too quickly before things have settled down. I have some 
> tweaks in my test harness to wait till things are ready so there are 
> no errors :) That's not a solution for general non-test situations - 
> although knowing how to wait till things are ready is always useful.
>
> https://svn.apache.org/repos/asf/qpid/dispatch/trunk/tests/system_test.py
>
> class Messenger adds a "flush" method that pumps the Messenger event 
> loop till there is no more work to do. Otherwise subscribe() in 
> particular gives no way to tell when the subscription is active.
>
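
A rough sketch of the "flush" idea, for illustration only. It assumes a work(timeout) call that returns a truthy value while there was something to do. The max_rounds cap and the FakeMessenger stand-in are inventions for this sketch so that it runs without a broker, and the real helper in system_test.py may differ.

```python
def flush(messenger, timeout=0.1, max_rounds=1000):
    """Pump messenger.work() until it reports that nothing was done.

    Assumes work(timeout) is truthy while work was performed. max_rounds is
    a safety cap added for this sketch. Returns the number of busy rounds.
    """
    rounds = 0
    while rounds < max_rounds and messenger.work(timeout):
        rounds += 1
    return rounds


class FakeMessenger:
    """Stand-in for illustration: pretends to have `pending` units of work."""

    def __init__(self, pending):
        self.pending = pending

    def work(self, timeout):
        if self.pending == 0:
            return False
        self.pending -= 1
        return True
```

Calling flush() right after subscribe() is the use case described above: it returns once the event loop has gone quiet, which is the closest available signal that the subscription is active.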
> Note: My situation is a bit special in that dispatch creates addresses 
> dynamically on subscribe and my tests involve slow stuff like 
> waypoints to brokers etc. That introduces a delay in subscribe that 
> probably isn't visible when the address is created beforehand.
>
> There's also Qpidd.wait_ready and Qdrouterd.wait_ready that wait for 
> qpidd and dispatch router to be ready respectively so I can be sure 
> that when I connect with proton they'll be listening. Those wait for 
> the expected listening ports to be connectable and, in the case of 
> dispatch, also do a qmf check to make sure that all expected outgoing 
> connectors are there.
>
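
The "wait for the listening port to be connectable" part can be sketched with nothing but the standard library. This is an illustration under assumptions, not the wait_ready code itself: the function name wait_port and its parameters are made up, and it only checks that a TCP connect succeeds, not the qmf state.

```python
import socket
import time


def wait_port(host, port, timeout=10.0, interval=0.1):
    """Return True once a TCP connect to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connection is closed again straight away; we only probe.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A test harness would call this after starting the broker or router and before creating any proton objects, so the first connect attempt does not race with startup.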
>> For info, leaving aside not being able to trap all the errors 
>> without being devious around fprintf (which to be fair works, but 
>> it's a bit sneaky and if you have multiple Messenger instances it 
>> won't tell you which one the error relates to), when I do get an 
>> error I appear to have to start from scratch - in other words:
>>
>> message.free();
>> messenger.free();
>> message = new proton.Message();
>> messenger = new proton.Messenger();
>> messenger.start();
>>
>> If I try to restart the original messenger or use existing queue I 
>> get no joy. It's not the end of the world but I've no idea what 
>> robust Messenger code is *supposed* to look like.
>>
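
The "start from scratch" recovery above could be wrapped in a retry loop with backoff. What follows is a sketch under assumptions, not a documented Messenger pattern: make_client and probe are hypothetical caller-supplied callables (for example "create and start a messenger" and "send a test message"), and the only thing assumed of the client object is a free() method.

```python
import time


def connect_with_rebuild(make_client, probe, max_attempts=5,
                         base_delay=0.5, sleep=time.sleep):
    """Build a client and probe it; on failure free it, back off, rebuild.

    Returns the first client whose probe succeeds. Re-raises the last error
    if every attempt fails.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        client = make_client()
        try:
            probe(client)
        except Exception as exc:  # a real application would catch narrower errors
            last_error = exc
            client.free()  # throw the broken instance away entirely
            if attempt < max_attempts:
                sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
        else:
            return client
    raise last_error
```

The sleep argument is injectable so the backoff can be tested without waiting. The loop gives up after max_attempts and surfaces the last error, so the caller still gets to decide what to do next.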
>> Presumably Alan and I aren't the only people who might like to be 
>> able to trap errors and restart? Or does everyone else write code 
>> that never fails ;->
> I always wondered how everybody but me can do that. Sigh. For you and 
> me I think we need to do some work on proton's error handling.
>
> - proton (or any library!) should NEVER EVER write anything direct to 
> stdout or stderr. It needs a (very simple) logging facility that can 
> write to stderr by default but can be redirected elsewhere.
> - proton should never log an error without also returning some useful 
> error condition to the application.
>
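
To make the two points above concrete, here is a small sketch of what such a facility might look like. It is written in Python for brevity and is not Proton code; set_log_sink, fail and TransportError are names invented for this illustration.

```python
import sys

_log_sink = None  # None means "write to stderr", the default


def set_log_sink(sink):
    """Redirect log lines to sink(line); pass None to restore stderr."""
    global _log_sink
    _log_sink = sink


def _log(line):
    if _log_sink is None:
        sys.stderr.write(line + "\n")
    else:
        _log_sink(line)


class TransportError(Exception):
    """Error condition handed back to the application (illustrative name)."""

    def __init__(self, code, text):
        super().__init__("[%d] %s" % (code, text))
        self.code = code
        self.text = text


def fail(code, text):
    """Log the error and raise it, so it is never only printed."""
    _log("ERROR[%d] %s" % (code, text))
    raise TransportError(code, text)
```

The design point is that logging and error reporting go through one place: every line that reaches the sink also reaches the caller as an error condition.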
> Proton has some useful pn_error_* functions, they just need to be used 
> more widely. In dispatch I introduced an errno-style thread-local 
> error code/message (in proton it would be a pn_error_t*). That allows 
> sensible error messages out of functions that want to return something 
> else (e.g. return a pointer or null and set the thread error). It also 
> allows you to work around lazy error handling (temporarily of course 
> (hahahaha)) - a caller a couple of stack frames up can detect an error 
> even if intermediate functions didn't check & propagate errors 
> properly. I'm not advocating lazy error checking but in C it is hard 
> to get everything.
>
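
The errno-style thread-local error can be sketched as follows. This is a Python illustration of the idea only, not the dispatch or proton implementation. All names are made up: find_link and careless_wrapper stand in for a function that returns an object or None, and an intermediate caller that ignores the result.

```python
import threading

_state = threading.local()  # each thread sees its own `error` attribute


def set_error(code, message):
    """Record the latest error for the calling thread; returns None."""
    _state.error = (code, message)
    return None


def clear_error():
    _state.error = None


def last_error():
    """Return (code, message) for this thread, or None if nothing is recorded."""
    return getattr(_state, "error", None)


def find_link(name):
    """Return an object or None; on None the reason is in last_error()."""
    if not name:
        return set_error(-2, "empty link name")
    return {"name": name}


def careless_wrapper(name):
    find_link(name)  # ignores the result, as lazy intermediate code might
    return "done"
```

A caller of careless_wrapper("") can still call last_error() afterwards and see the error, even though the wrapper never propagated it. Errors recorded on one thread are not visible from another.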
> FEEDBACK PLEASE: anyone think this is a great/horrible idea? Does 
> proton already do things I've missed that would make this unnecessary?
>
> Cheers,
> Alan.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> For additional commands, e-mail: users-help@qpid.apache.org
>



Re: proton Messenger error handling/recovery REQUEST FEEDBACK!

Posted by Ted Ross <tr...@redhat.com>.


On 09/09/2014 10:59 AM, Alan Conway wrote:
> On Mon, 2014-09-08 at 19:07 +0100, Fraser Adams wrote:
>> Messenger gurus seem to be keeping their heads down a bit.
>>
>> Is it *really* just Alan and I who are interested to understand the 
>> error handling/reconnection behaviour of Messenger?
>>
>> Is anybody using it in "industrial strength" applications or is it just 
>> being used in quick and dirty demos? Without error handling and 
>> reconnection mechanisms I'm struggling to see how it can be used for the 
>> former.
>>
>> I can likely hack things and Alan also mentioned that he "cheats", but 
>> I'd really like to know from people who really understand messenger how 
>> to do it *properly*.
>>
> 
> I've been looking at this and error handling in Messenger is not just a
> matter of fixing implementation, there are some pretty big API questions
> to be answered about when and how you can report errors. It's not
> unfixable but I'm starting to think about moving away from Messenger and
> towards using the proton Engine API.
> 
> The original tradeoff was that engine is more complete and flexible but
> harder to use, whereas Messenger is easy but not as complete/flexible.
> However if you look at the toolkit & examples at
>  https://github.com/grs/examples

Gordon committed this content in a branch at:

https://svn.apache.org/repos/asf/qpid/proton/branches/examples/tutorial

> it makes engine a lot more appealing. The idea is to provide blocks of
> "normal default" behavior in a toolkit to get going quickly (and to keep
> you going for many/most uses) but allow those to be modified or replaced
> as things get more complex. The nice thing about this is that you know
> you can peel back the toolkit if you need to and get full access to the
> proton event machine, so anything proton knows you can react to.
> 
> If we can make the engine API approachable enough for general messaging
> use (while keeping it powerful enough for integration use) then it might
> make more sense to focus on doing that than on maintaining two different
> APIs for proton.
> 
> Cheers,
> Alan.
> 

Re: proton Messenger error handling/recovery REQUEST FEEDBACK!

Posted by Gordon Sim <gs...@redhat.com>.

On 09/09/2014 03:59 PM, Alan Conway wrote:
> On Mon, 2014-09-08 at 19:07 +0100, Fraser Adams wrote:
>> Messenger gurus seem to be keeping their heads down a bit.
>>
>> Is it *really* just Alan and I who are interested to understand the
>> error handling/reconnection behaviour of Messenger?
>>
>> Is anybody using it in "industrial strength" applications or is it just
>> being used in quick and dirty demos? Without error handling and
>> reconnection mechanisms I'm struggling to see how it can be used for the
>> former.
>>
>> I can likely hack things and Alan also mentioned that he "cheats", but
>> I'd really like to know from people who really understand messenger how
>> to do it *properly*.
>>
>
> I've been looking at this and error handling in Messenger is not just a
> matter of fixing implementation, there are some pretty big API questions
> to be answered about when and how you can report errors. It's not
> unfixable but I'm starting to think about moving away from Messenger and
> towards using the proton Engine API.
>
> The original tradeoff was that engine is more complete and flexible but
> harder to use, whereas Messenger is easy but not as complete/flexible.
> However if you look at the toolkit & examples at
>   https://github.com/grs/examples
> it makes engine a lot more appealing.

This work has now been moved to a branch in proton's svn:

https://svn.apache.org/repos/asf/qpid/proton/branches/examples/tutorial/_build/html/tutorial.html

examples themselves are in:

https://svn.apache.org/repos/asf/qpid/proton/branches/examples/tutorial

I'm working on a slightly more involved example at the moment which I'll 
hopefully be in a position to check in before too long.

I should also point out again that these examples originated from Rafi's 
demo of the event oriented use of proton 
(https://github.com/rhs/qpid-proton-demo) which also shows how features 
such as those offered by Messenger (routing, connection management) can 
be built as more composable utilities along the lines Alan describes below.

> The idea is to provide blocks of
> "normal default" behavior in a toolkit to get going quickly (and to keep
> you going for many/most uses) but allow those to be modified or replaced
> as things get more complex. The nice thing about this is that you know
> you can peel back the toolkit if you need to and get full access to the
> proton event machine, so anything proton knows you can react to.
>
> If we can make the engine API approachable enough for general messaging
> use (while keeping it powerful enough for integration use) then it might
> make more sense to focus on doing that than on maintaining two different
> APIs for proton.

I very much believe it would.



Re: proton Messenger error handling/recovery REQUEST FEEDBACK!

Posted by Alan Conway <ac...@redhat.com>.

On Mon, 2014-09-08 at 19:07 +0100, Fraser Adams wrote:
> Messenger gurus seem to be keeping their heads down a bit.
> 
> Is it *really* just Alan and I who are interested to understand the 
> error handling/reconnection behaviour of Messenger?
> 
> Is anybody using it in "industrial strength" applications or is it just 
> being used in quick and dirty demos? Without error handling and 
> reconnection mechanisms I'm struggling to see how it can be used for the 
> former.
> 
> I can likely hack things and Alan also mentioned that he "cheats", but 
> I'd really like to know from people who really understand messenger how 
> to do it *properly*.
> 

I've been looking at this and error handling in Messenger is not just a
matter of fixing implementation, there are some pretty big API questions
to be answered about when and how you can report errors. It's not
unfixable but I'm starting to think about moving away from Messenger and
towards using the proton Engine API.

The original tradeoff was that engine is more complete and flexible but
harder to use, whereas Messenger is easy but not as complete/flexible.
However if you look at the toolkit & examples at
 https://github.com/grs/examples
it makes engine a lot more appealing. The idea is to provide blocks of
"normal default" behavior in a toolkit to get going quickly (and to keep
you going for many/most uses) but allow those to be modified or replaced
as things get more complex. The nice thing about this is that you know
you can peel back the toolkit if you need to and get full access to the
proton event machine, so anything proton knows you can react to.
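
To illustrate the "default blocks that can be replaced" idea, here is a small self-contained sketch. It does not use the toolkit's actual classes or event names. DefaultHandler, dispatch and the event dictionaries are stand-ins invented here to show the shape of the approach.

```python
class DefaultHandler:
    """Default reactions; subclasses override only what they care about."""

    def __init__(self):
        self.log = []

    def on_connected(self, event):
        self.log.append("connected")

    def on_message(self, event):
        self.log.append("message:%s" % event.get("body"))

    def on_disconnected(self, event):
        self.log.append("disconnected")

    def on_unhandled(self, event):
        self.log.append("ignored:%s" % event["type"])


def dispatch(handler, event):
    """Route an event dict to handler.on_<type>, else to on_unhandled."""
    method = getattr(handler, "on_" + event["type"], handler.on_unhandled)
    method(event)


class ReconnectingHandler(DefaultHandler):
    """Replaces one block of default behaviour and keeps the rest."""

    def on_disconnected(self, event):
        self.log.append("reconnect requested")
```

The point of the design is that a disconnect is just another event the application can react to, which is exactly the information Messenger hides.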

If we can make the engine API approachable enough for general messaging
use (while keeping it powerful enough for integration use) then it might
make more sense to focus on doing that than on maintaining two different
APIs for proton.

Cheers,
Alan.


Re: proton Messenger error handling/recovery REQUEST FEEDBACK!

Posted by Marcel Meulemans <m....@tkhinnovations.com>.

On Tue, Sep 9, 2014 at 5:22 PM, Alan Conway <ac...@redhat.com> wrote:
> Given that engine is already a more complete and flexible API (in the
> sense of offering full low-level access to the entire AMQP protocol),
> and that people are demonstrating that it can be made easier to use by
> layering tools on top, perhaps that is where we should focus our efforts
> rather than splitting them over two APIs.

We are using proton in a production environment. We started out using
the messenger API but at some point we were struggling with the fact
that there is no notion of a connection in this API. This is made
worse by the fact that the handling of connection failures (at least in
proton-j, not sure about proton-c) is not working. For us the concept
of a connection is crucial in a communication API because drastic
measures (from a software point of view) need to be taken if the loss
of communication persists for too long a period of time. Therefore we
switched to the engine API and implemented our own "messenger", which in
our experience was a friendly enough API and has given us what we
need.

So in that light I would support Alan's suggestion for focusing more
on the engine API (of course this is an easy statement for us because
we are not using the messenger API anymore). This would also maybe
give some more time to focus on existing connection related (and
other) bugs currently in the engine API such as
https://issues.apache.org/jira/browse/PROTON-644 ... :)

> > > >>>> I've probably missed something, but I don't know how to reliably detect
> > > >>>> failures and reconnect.

I don't think this is possible at the moment. At least in proton-j the
messenger will swallow exceptions thrown on channel.write()
(indicating a connection failure), will not correctly handle errors
returned by channel.read() and support for detecting connection
failures through idle timeout is not implemented yet in the proton-j
engine API.
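
Until idle timeout support exists in the engine, an application-level watchdog is one way to notice a dead peer. The sketch below is only an illustration and not part of proton-j or proton-c. IdleWatchdog is a made-up name, and the clock is injectable so the logic can be exercised without waiting.

```python
import time


class IdleWatchdog:
    """Flags a connection as dead when nothing arrives for idle_timeout seconds."""

    def __init__(self, idle_timeout, clock=time.monotonic):
        self.idle_timeout = idle_timeout
        self.clock = clock
        self.last_seen = clock()

    def frame_received(self):
        """Call whenever any frame (including an empty heartbeat) arrives."""
        self.last_seen = self.clock()

    def expired(self):
        """True when the silence has lasted longer than idle_timeout."""
        return self.clock() - self.last_seen > self.idle_timeout
```

The I/O loop would call frame_received() on every successful read and check expired() periodically, treating an expiry the same way as an explicit connection error.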

> > > > - proton (or any library!) should NEVER EVER write anything direct to
> > > > stdout or stderr. It needs a (very simple) logging facility that can
> > > > write to stderr by default but can be redirected elsewhere.
> > > > - proton should never log an error without also returning some useful
> > > > error condition to the application.

Agree, although these printfs are fairly scarce at the moment. I guess
for this the pn_transport_set_tracer idea (which works well) could be
extended.

-- 
Marcel


Re: proton Messenger error handling/recovery REQUEST FEEDBACK!

Posted by Alan Conway <ac...@redhat.com>.

On Tue, 2014-09-09 at 08:34 -0400, Ken Giusti wrote:
> I'm also interested; personally never felt comfortable with the lack of visibility regarding things like connection failures that Messenger's api currently provides.
> 
> Tangentially related, perhaps - I'd like to see errors reported via the event collector interface.  While my issue is engine related, perhaps Messenger should provide applications access to the event "bus"?
> 
> I've opened a bug against the engine event model to include errors (at least for the transport/connection objects):
> 
> https://issues.apache.org/jira/browse/PROTON-656
> 

See my previous mail - I'm thinking of moving to engine and helping to
make that API easier to use rather than putting the effort into Messenger.
See https://github.com/grs/examples/blob/master/proton_utils.py

The issue in Messenger is that it aims to hide connections and let the
user think about messages - that is a laudable goal. The problem is that
it still needs to report errors in terms that make sense to the API.
"connection X broke" doesn't make sense because messenger offers no
notion of connection X. Things that would make sense are "Message X will
never be delivered because of network problems" and "Subscription Y will
never receive messages because of network problems". The trick is where
and when to report these conditions. On message trackers? Exceptions on
put/send? Exceptions on get/recv? Some other event source?

Of course it would be great if Messenger did transparent failover for
you, and it can in the future - but you still need error notification.
Failover can itself fail - all nodes in the cluster are down, the local
network connection is dead etc.

None of these problems are unsolvable, but there's some work to be done.
Given that engine is already a more complete and flexible API (in the
sense of offering full low-level access to the entire AMQP protocol),
and that people are demonstrating that it can be made easier to use by
layering tools on top, perhaps that is where we should focus our efforts
rather than splitting them over two APIs.

> -K
> 
> ----- Original Message -----
> > From: "Fraser Adams" <fr...@blueyonder.co.uk>
> > To: users@qpid.apache.org
> > Sent: Monday, September 8, 2014 2:07:23 PM
> > Subject: Re: proton Messenger error handling/recovery REQUEST FEEDBACK!
> > 
> > Messenger gurus seem to be keeping their heads down a bit.
> > 
> > Is it *really* just Alan and I who are interested to understand the
> > error handling/reconnection behaviour of Messenger?
> > 
> > Is anybody using it in "industrial strength" applications or is it just
> > being used in quick and dirty demos? Without error handling and
> > reconnection mechanisms I'm struggling to see how it can be used for the
> > former.
> > 
> > I can likely hack things and Alan also mentioned that he "cheats", but
> > I'd really like to know from people who really understand messenger how
> > to do it *properly*.
> > 
> > Frase
> > 
> > 
> > On 05/09/14 14:17, Alan Conway wrote:
> > > On Thu, 2014-09-04 at 18:28 +0100, Fraser Adams wrote:
> > >> On 03/09/14 23:29, Alan Conway wrote:
> > >>> On Wed, 2014-09-03 at 20:05 +0100, Fraser Adams wrote:
> > >>>> Hello,
> > >>>> I've probably missed something, but I don't know how to reliably detect
> > >>>> failures and reconnect.
> > >>>>
> > >>>> So if I sent to an address with a freshly stood up Messenger instance
> > >>>> and the address can't be found things aren't too bad and I wind up with
> > >>>> an ECONNREFUSED that I could do something with, however if I've been
> > >>>> sending messages to a valid address then I kill off the consumer I see
> > >>>> a:
> > >>>>
> > >>>> [0x513380]:ERROR amqp:connection:framing-error connection aborted
> > >>>> [0x513380]:ERROR[-2] connection aborted
> > >>>>
> > >>>> CONNECTION ERROR connection aborted (remote)
> > >>>>
> > >>>> The thing is that all of these are *internally* generated messages sent
> > >>>> to the console via fprintf, so my *application* doesn't really know
> > >>>> about them (though I could be crafty and interpose my own cheeky fprintf
> > >>>> to intercept them). That doesn't quite sound like the desired behaviour
> > >>>> for a robust system?
> > >>>>
> > >>>>
> > >>>> Similarly should I actually trap an error what's the correct way to
> > >>>> continue, as it happens currently my app carries on silently doing
> > >>>> nothing useful and continuing to do so even when the peer restarts (so
> > >>>> there is no magic internal reconnection logic as far as I can see).
> > >>>>
> > >>>> do I have to do a
> > >>>> messenger.stop()
> > >>>> messenger.start()
> > >>>>
> > >>>> cycle to get things going again, I'm guessing so, but I'll like to know
> > >>>> what the "correct"/expected way to create Messenger code that is robust
> > >>>> against remote failures, as far as I can see there are no examples of
> > >>>> that sort of thing?
> > >>> I've come up against similar problems, I think it's an area that needs
> > >>> some work in Proton. Is anybody already working on/thinking about this
> > >>> area?
> > >>>
> > >>> Cheers,
> > >>> Alan.
> > >>>
> > >> I'd definitely like to know how others deal with this sort of thing.
> > > I cheat. I've been using proton in dispatch system tests, I come up
> > > against these issues when I start up some proton/dispatch network and
> > > try to use it too quickly before things have settled down. I have some
> > > tweaks in my test harness to wait till things are ready so there are no
> > > errors :) That's not a solution for general non-test situations -
> > > although knowing how to wait till things are ready is always useful.
> > >
> > > https://svn.apache.org/repos/asf/qpid/dispatch/trunk/tests/system_test.py
> > >
> > > class Messenger adds a "flush" method that pumps the Messenger event
> > > loop till there is no more work to do. Otherwise subscribe() in
> > > particular gives no way to tell when the subscription is active.
> > >
> > > Note: My situation is a bit special in that dispatch creates addresses
> > > dynamically on subscribe and my tests involve slow stuff like waypoints
> > > to brokers etc. That introduces a delay in subscribe that probably isn't
> > > visible when the address is created beforehand.
> > >
> > > There's also Qpidd.wait_ready and Qdrouterd.wait_ready that wait for
> > > qpidd and dispatch router to be ready respectively so I can be sure that
> > > when I connect with proton they'll be listening. Those wait for the
> > > expected listening ports to be connectable and in the case of dispatch
> > > also does a qmf check to make sure that all expected outgoing connectors
> > > are there.
> > >
> > >> For info notwithstanding not necessarily being able to trap all the
> > >> errors without being devious around fprintf  (which to be fair works,
> > >> but it's a bit sneaky and if you have multiple Messenger instances won't
> > >> tell you which one the error relates to) but when I do get an error I
> > >> appear to have to start from scratch - in other words:
> > >>
> > >> message.free();
> > >> messenger.free();
> > >> message = new proton.Message();
> > >> messenger = new proton.Messenger();
> > >> messenger.start();
> > >>
> > >> If I try to restart the original messenger or use existing queue I get
> > >> no joy. It's not the end of the world but I've no idea what robust
> > >> Messenger code is *supposed* to look like.
> > >>
> > >> Presumably Alan and I aren't the only people who might like to be able
> > >> to trap errors and restart? Or does every one else write code that never
> > >> fails ;->
> > > I always wondered how everybody but me can do that. Sigh. For you and me
> > > I think we need to do some work on proton's error handling.
> > >
> > > - proton (or any library!) should NEVER EVER write anything direct to
> > > stdout or stderr. It needs a (very simple) logging facility that can
> > > write to stderr by default but can be redirected elsewhere.
> > > - proton should never log an error without also returning some useful
> > > error condition to the application.
> > >
> > > Proton has some useful pn_error_* functions, they just need to be used
> > > more widely. In dispatch I introduced an errno-style thread-local error
> > > code/message (in proton it would be a pn_error_t*) That allows sensible
> > > error messages out of functions that want to return something else (e.g.
> > > pointer or null and set the thread error) It also allows you to work
> > > around lazy error handling (temporarily of course (hahahaha)) - a caller
> > > couple of stack frames up can detect an error even if intermediate
> > > functions didn't check & propagate errors properly. I'm not advocating
> > > lazy error checking but in C it is hard to get everything.
> > >
> > > FEEDBACK PLEASE: anyone think this is a great/horrible idea? Does proton
> > > already do things I've missed that would make this unnecessary?
> > >
> > > Cheers,
> > > Alan.
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> > > For additional commands, e-mail: users-help@qpid.apache.org
> > >
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> > For additional commands, e-mail: users-help@qpid.apache.org
> > 
> > 
> 




Re: proton Messenger error handling/recovery REQUEST FEEDBACK!

Posted by Ken Giusti <kg...@redhat.com>.

I'm also interested; personally never felt comfortable with the lack of visibility regarding things like connection failures that Messenger's api currently provides.

Tangentially related, perhaps - I'd like to see errors reported via the event collector interface.  While my issue is engine related, perhaps Messenger should provide applications access to the event "bus"?

I've opened a bug against the engine event model to include errors (at least for the transport/connection objects):

https://issues.apache.org/jira/browse/PROTON-656


-K

----- Original Message -----
> From: "Fraser Adams" <fr...@blueyonder.co.uk>
> To: users@qpid.apache.org
> Sent: Monday, September 8, 2014 2:07:23 PM
> Subject: Re: proton Messenger error handling/recovery REQUEST FEEDBACK!
> 
> Messenger gurus seem to be keeping their heads down a bit.
> 
> Is it *really* just Alan and I who are interested to understand the
> error handling/reconnection behaviour of Messenger?
> 
> Is anybody using it in "industrial strength" applications or is it just
> being used in quick and dirty demos? Without error handling and
> reconnection mechanisms I'm struggling to see how it can be used for the
> former.
> 
> I can likely hack things and Alan also mentioned that he "cheats", but
> I'd really like to know from people who really understand messenger how
> to do it *properly*.
> 
> Frase
> 
> 
> On 05/09/14 14:17, Alan Conway wrote:
> > On Thu, 2014-09-04 at 18:28 +0100, Fraser Adams wrote:
> >> On 03/09/14 23:29, Alan Conway wrote:
> >>> On Wed, 2014-09-03 at 20:05 +0100, Fraser Adams wrote:
> >>>> Hello,
> >>>> I've probably missed something, but I don't know how to reliably detect
> >>>> failures and reconnect.
> >>>>
> >>>> So if I sent to an address with a freshly stood up Messenger instance
> >>>> and the address can't be found things aren't too bad and I wind up with
> >>>> an ECONNREFUSED that I could do something with, however if I've been
> >>>> sending messages to a valid address then I kill off the consumer I see
> >>>> a:
> >>>>
> >>>> [0x513380]:ERROR amqp:connection:framing-error connection aborted
> >>>> [0x513380]:ERROR[-2] connection aborted
> >>>>
> >>>> CONNECTION ERROR connection aborted (remote)
> >>>>
> >>>> The thing is that all of these are *internally* generated messages sent
> >>>> to the console via fprintf, so my *application* doesn't really know
> >>>> about them (though I could be crafty and interpose my own cheeky fprintf
> >>>> to intercept them). That doesn't quite sound like the desired behaviour
> >>>> for a robust system?
> >>>>
> >>>>
> >>>> Similarly should I actually trap an error what's the correct way to
> >>>> continue, as it happens currently my app carries on silently doing
> >>>> nothing useful and continuing to do so even when the peer restarts (so
> >>>> there is no magic internal reconnection logic as far as I can see).
> >>>>
> >>>> do I have to do a
> >>>> messenger.stop()
> >>>> messenger.start()
> >>>>
> >>>> cycle to get things going again, I'm guessing so, but I'll like to know
> >>>> what the "correct"/expected way to create Messenger code that is robust
> >>>> against remote failures, as far as I can see there are no examples of
> >>>> that sort of thing?
> >>> I've come up against similar problems, I think it's an area that needs
> >>> some work in Proton. Is anybody already working on/thinking about this
> >>> area?
> >>>
> >>> Cheers,
> >>> Alan.
> >>>
> >> I'd definitely like to know how others deal with this sort of thing.
> > I cheat. I've been using proton in dispatch system tests, I come up
> > against these issues when I start up some proton/dispatch network and
> > try to use it too quickly before things have settled down. I have some
> > tweaks in my test harness to wait till things are ready so there are no
> > errors :) That's not a solution for general non-test situations -
> > although knowing how to wait till things are ready is always useful.
> >
> > https://svn.apache.org/repos/asf/qpid/dispatch/trunk/tests/system_test.py
> >
> > class Messenger adds a "flush" method that pumps the Messenger event
> > loop till there is no more work to do. Otherwise subscribe() in
> > particular gives no way to tell when the subscription is active.
> >
> > Note: My situation is a bit special in that dispatch creates addresses
> > dynamically on subscribe and my tests involve slow stuff like waypoints
> > to brokers etc. That introduces a delay in subscribe that probably isn't
> > visible when the address is created beforehand.
> >
> > There's also Qpidd.wait_ready and Qdrouterd.wait_ready that wait for
> > qpidd and dispatch router to be ready respectively so I can be sure that
> > when I connect with proton they'll be listening. Those wait for the
> > expected listening ports to be connectable and in the case of dispatch
> > also does a qmf check to make sure that all expected outgoing connectors
> > are there.
> >
> >> For info notwithstanding not necessarily being able to trap all the
> >> errors without being devious around fprintf  (which to be fair works,
> >> but it's a bit sneaky and if you have multiple Messenger instances won't
> >> tell you which one the error relates to) but when I do get an error I
> >> appear to have to start from scratch - in other words:
> >>
> >> message.free();
> >> messenger.free();
> >> message = new proton.Message();
> >> messenger = new proton.Messenger();
> >> messenger.start();
> >>
> >> If I try to restart the original messenger or use existing queue I get
> >> no joy. It's not the end of the world but I've no idea what robust
> >> Messenger code is *supposed* to look like.
> >>
> >> Presumably Alan and I aren't the only people who might like to be able
> >> to trap errors and restart? Or does every one else write code that never
> >> fails ;->
> > I always wondered how everybody but me can do that. Sigh. For you and me
> > I think we need to do some work on proton's error handling.
> >
> > - proton (or any library!) should NEVER EVER write anything direct to
> > stdout or stderr. It needs a (very simple) logging facility that can
> > write to stderr by default but can be redirected elsewhere.
> > - proton should never log an error without also returning some useful
> > error condition to the application.
> >
> > Proton has some useful pn_error_* functions, they just need to be used
> > more widely. In dispatch I introduced an errno-style thread-local error
> > code/message (in proton it would be a pn_error_t*) That allows sensible
> > error messages out of functions that want to return something else (e.g.
> > pointer or null and set the thread error) It also allows you to work
> > around lazy error handling (temporarily of course (hahahaha)) - a caller
> > couple of stack frames up can detect an error even if intermediate
> > functions didn't check & propagate errors properly. I'm not advocating
> > lazy error checking but in C it is hard to get everything.
> >
> > FEEDBACK PLEASE: anyone think this is a great/horrible idea? Does proton
> > already do things I've missed that would make this unnecessary?
> >
> > Cheers,
> > Alan.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> > For additional commands, e-mail: users-help@qpid.apache.org
> >
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> For additional commands, e-mail: users-help@qpid.apache.org
> 
> 

-- 
-K


Re: proton Messenger error handling/recovery REQUEST FEEDBACK!

Posted by Fraser Adams <fr...@blueyonder.co.uk>.

Messenger gurus seem to be keeping their heads down a bit.

Is it *really* just Alan and I who are interested to understand the 
error handling/reconnection behaviour of Messenger?

Is anybody using it in "industrial strength" applications or is it just 
being used in quick and dirty demos? Without error handling and 
reconnection mechanisms I'm struggling to see how it can be used for the 
former.

I can likely hack things and Alan also mentioned that he "cheats", but 
I'd really like to know from people who really understand messenger how 
to do it *properly*.

Frase


On 05/09/14 14:17, Alan Conway wrote:
> On Thu, 2014-09-04 at 18:28 +0100, Fraser Adams wrote:
>> On 03/09/14 23:29, Alan Conway wrote:
>>> On Wed, 2014-09-03 at 20:05 +0100, Fraser Adams wrote:
>>>> Hello,
>>>> I've probably missed something, but I don't know how to reliably detect
>>>> failures and reconnect.
>>>>
>>>> So if I sent to an address with a freshly stood up Messenger instance
>>>> and the address can't be found things aren't too bad and I wind up with
>>>> an ECONNREFUSED that I could do something with, however if I've been
>>>> sending messages to a valid address then I kill off the consumer I see a:
>>>>
>>>> [0x513380]:ERROR amqp:connection:framing-error connection aborted
>>>> [0x513380]:ERROR[-2] connection aborted
>>>>
>>>> CONNECTION ERROR connection aborted (remote)
>>>>
>>>> The thing is that all of these are *internally* generated messages sent
>>>> to the console via fprintf, so my *application* doesn't really know
>>>> about them (though I could be crafty and interpose my own cheeky fprintf
>>>> to intercept them). That doesn't quite sound like the desired behaviour
>>>> for a robust system?
>>>>
>>>>
>>>> Similarly should I actually trap an error what's the correct way to
>>>> continue, as it happens currently my app carries on silently doing
>>>> nothing useful and continuing to do so even when the peer restarts (so
>>>> there is no magic internal reconnection logic as far as I can see).
>>>>
>>>> do I have to do a
>>>> messenger.stop()
>>>> messenger.start()
>>>>
>>>> cycle to get things going again, I'm guessing so, but I'll like to know
>>>> what the "correct"/expected way to create Messenger code that is robust
>>>> against remote failures, as far as I can see there are no examples of
>>>> that sort of thing?
>>> I've come up against similar problems, I think it's an area that needs
>>> some work in Proton. Is anybody already working on/thinking about this
>>> area?
>>>
>>> Cheers,
>>> Alan.
>>>
>> I'd definitely like to know how others deal with this sort of thing.
> I cheat. I've been using proton in dispatch system tests, I come up
> against these issues when I start up some proton/dispatch network and
> try to use it too quickly before things have settled down. I have some
> tweaks in my test harness to wait till things are ready so there are no
> errors :) That's not a solution for general non-test situations -
> although knowing how to wait till things are ready is always useful.
>
> https://svn.apache.org/repos/asf/qpid/dispatch/trunk/tests/system_test.py
>
> class Messenger adds a "flush" method that pumps the Messenger event
> loop till there is no more work to do. Otherwise subscribe() in
> particular gives no way to tell when the subscription is active.
>
> Note: My situation is a bit special in that dispatch creates addresses
> dynamically on subscribe and my tests involve slow stuff like waypoints
> to brokers etc. That introduces a delay in subscribe that probably isn't
> visible when the address is created beforehand.
>
> There's also Qpidd.wait_ready and Qdrouterd.wait_ready that wait for
> qpidd and dispatch router to be ready respectively so I can be sure that
> when I connect with proton they'll be listening. Those wait for the
> expected listening ports to be connectable and in the case of dispatch
> also does a qmf check to make sure that all expected outgoing connectors
> are there. 		
>
>> For info notwithstanding not necessarily being able to trap all the
>> errors without being devious around fprintf  (which to be fair works,
>> but it's a bit sneaky and if you have multiple Messenger instances won't
>> tell you which one the error relates to) but when I do get an error I
>> appear to have to start from scratch - in other words:
>>
>> message.free();
>> messenger.free();
>> message = new proton.Message();
>> messenger = new proton.Messenger();
>> messenger.start();
>>
>> If I try to restart the original messenger or use existing queue I get
>> no joy. It's not the end of the world but I've no idea what robust
>> Messenger code is *supposed* to look like.
>>
>> Presumably Alan and I aren't the only people who might like to be able
>> to trap errors and restart? Or does every one else write code that never
>> fails ;->
> I always wondered how everybody but me can do that. Sigh. For you and me
> I think we need to do some work on proton's error handling.
>
> - proton (or any library!) should NEVER EVER write anything direct to
> stdout or stderr. It needs a (very simple) logging facility that can
> write to stderr by default but can be redirected elsewhere.
> - proton should never log an error without also returning some useful
> error condition to the application.
>
> Proton has some useful pn_error_* functions, they just need to be used
> more widely. In dispatch I introduced an errno-style thread-local error
> code/message (in proton it would be a pn_error_t*) That allows sensible
> error messages out of functions that want to return something else (e.g.
> pointer or null and set the thread error) It also allows you to work
> around lazy error handling (temporarily of course (hahahaha)) - a caller
> couple of stack frames up can detect an error even if intermediate
> functions didn't check & propagate errors properly. I'm not advocating
> lazy error checking but in C it is hard to get everything.
>
> FEEDBACK PLEASE: anyone think this is a great/horrible idea? Does proton
> already do things I've missed that would make this unnecessary?
>
> Cheers,
> Alan.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> For additional commands, e-mail: users-help@qpid.apache.org
>



Re: proton Messenger error handling/recovery REQUEST FEEDBACK!

Posted by Alan Conway <ac...@redhat.com>.

On Thu, 2014-09-04 at 18:28 +0100, Fraser Adams wrote:
> On 03/09/14 23:29, Alan Conway wrote:
> > On Wed, 2014-09-03 at 20:05 +0100, Fraser Adams wrote:
> >> Hello,
> >> I've probably missed something, but I don't know how to reliably detect
> >> failures and reconnect.
> >>
> >> So if I sent to an address with a freshly stood up Messenger instance
> >> and the address can't be found things aren't too bad and I wind up with
> >> an ECONNREFUSED that I could do something with, however if I've been
> >> sending messages to a valid address then I kill off the consumer I see a:
> >>
> >> [0x513380]:ERROR amqp:connection:framing-error connection aborted
> >> [0x513380]:ERROR[-2] connection aborted
> >>
> >> CONNECTION ERROR connection aborted (remote)
> >>
> >> The thing is that all of these are *internally* generated messages sent
> >> to the console via fprintf, so my *application* doesn't really know
> >> about them (though I could be crafty and interpose my own cheeky fprintf
> >> to intercept them). That doesn't quite sound like the desired behaviour
> >> for a robust system?
> >>
> >>
> >> Similarly should I actually trap an error what's the correct way to
> >> continue, as it happens currently my app carries on silently doing
> >> nothing useful and continuing to do so even when the peer restarts (so
> >> there is no magic internal reconnection logic as far as I can see).
> >>
> >> do I have to do a
> >> messenger.stop()
> >> messenger.start()
> >>
> >> cycle to get things going again, I'm guessing so, but I'll like to know
> >> what the "correct"/expected way to create Messenger code that is robust
> >> against remote failures, as far as I can see there are no examples of
> >> that sort of thing?
> > I've come up against similar problems, I think it's an area that needs
> > some work in Proton. Is anybody already working on/thinking about this
> > area?
> >
> > Cheers,
> > Alan.
> >
> I'd definitely like to know how others deal with this sort of thing.

I cheat. I've been using proton in dispatch system tests, and I come up
against these issues when I start up some proton/dispatch network and
try to use it too quickly, before things have settled down. I have some
tweaks in my test harness to wait till things are ready so there are no
errors :) That's not a solution for general non-test situations,
although knowing how to wait till things are ready is always useful.

https://svn.apache.org/repos/asf/qpid/dispatch/trunk/tests/system_test.py

class Messenger adds a "flush" method that pumps the Messenger event
loop till there is no more work to do. Otherwise subscribe() in
particular gives no way to tell when the subscription is active.

Note: My situation is a bit special in that dispatch creates addresses
dynamically on subscribe and my tests involve slow stuff like waypoints
to brokers etc. That introduces a delay in subscribe that probably isn't
visible when the address is created beforehand. 

There's also Qpidd.wait_ready and Qdrouterd.wait_ready, which wait for
qpidd and dispatch router to be ready respectively, so I can be sure that
when I connect with proton they'll be listening. Both wait for the
expected listening ports to be connectable, and the dispatch one also
does a QMF check to make sure that all expected outgoing connectors
are there.
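
The "wait for the listening port to be connectable" part can be
sketched roughly as below. This is a standalone illustration, not the
code from system_test.py: the wait_port name and its parameters are
assumptions, and it covers only the TCP check, not the QMF check.

```python
import socket
import time


def wait_port(host, port, timeout=10.0, interval=0.1):
    """Block until a TCP connect to (host, port) succeeds or timeout expires.

    Returns True once the port accepts a connection, False on timeout.
    Hypothetical helper; name and defaults are illustrative only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (e.g. ECONNREFUSED) while
            # nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A test harness would call something like wait_port("127.0.0.1", 5672)
after starting the broker and before creating any proton objects.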

> 
> For info notwithstanding not necessarily being able to trap all the 
> errors without being devious around fprintf  (which to be fair works, 
> but it's a bit sneaky and if you have multiple Messenger instances won't 
> tell you which one the error relates to) but when I do get an error I 
> appear to have to start from scratch - in other words:
> 
> message.free();
> messenger.free();
> message = new proton.Message();
> messenger = new proton.Messenger();
> messenger.start();
> 
> If I try to restart the original messenger or use existing queue I get 
> no joy. It's not the end of the world but I've no idea what robust 
> Messenger code is *supposed* to look like.
> 
> Presumably Alan and I aren't the only people who might like to be able 
> to trap errors and restart? Or does every one else write code that never 
> fails ;->

I always wondered how everybody but me can do that. Sigh. For you and me
I think we need to do some work on proton's error handling. 

- proton (or any library!) should NEVER EVER write anything direct to
stdout or stderr. It needs a (very simple) logging facility that can
write to stderr by default but can be redirected elsewhere.
- proton should never log an error without also returning some useful
error condition to the application. 

Proton has some useful pn_error_* functions, they just need to be used
more widely. In dispatch I introduced an errno-style thread-local error
code/message (in proton it would be a pn_error_t*). That allows sensible
error messages out of functions that want to return something else (e.g.
return a pointer or null and set the thread error). It also allows you
to work around lazy error handling (temporarily of course (hahahaha)): a
caller a couple of stack frames up can detect an error even if
intermediate functions didn't check & propagate errors properly. I'm not
advocating lazy error checking, but in C it is hard to get everything.
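
The errno-style idea can be sketched in Python as follows. This is an
analogue for illustration, not the dispatch or proton C code: the
function names and the -2 code are made up, and threading.local stands
in for C thread-local storage.

```python
import threading

# One error slot per thread, like errno.
_state = threading.local()


def set_error(code, message):
    _state.error = (code, message)


def clear_error():
    _state.error = None


def last_error():
    """Return (code, message) for the calling thread, or None."""
    return getattr(_state, "error", None)


def find_route(table, address):
    """Return a route or None; on None the reason is in last_error()."""
    clear_error()
    route = table.get(address)
    if route is None:
        set_error(-2, "no route for %r" % address)
    return route


def lazy_middle(table, address):
    # Deliberately does no error checking or propagation of its own.
    return find_route(table, address)
```

A caller of lazy_middle() that gets None back can still ask
last_error() what went wrong, even though the intermediate function
never looked at it.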

FEEDBACK PLEASE: anyone think this is a great/horrible idea? Does proton
already do things I've missed that would make this unnecessary?

Cheers,
Alan.


Re: proton Messenger error handling/recovery

Posted by Fraser Adams <fr...@blueyonder.co.uk>.

On 03/09/14 23:29, Alan Conway wrote:
> On Wed, 2014-09-03 at 20:05 +0100, Fraser Adams wrote:
>> Hello,
>> I've probably missed something, but I don't know how to reliably detect
>> failures and reconnect.
>>
>> So if I sent to an address with a freshly stood up Messenger instance
>> and the address can't be found things aren't too bad and I wind up with
>> an ECONNREFUSED that I could do something with, however if I've been
>> sending messages to a valid address then I kill off the consumer I see a:
>>
>> [0x513380]:ERROR amqp:connection:framing-error connection aborted
>> [0x513380]:ERROR[-2] connection aborted
>>
>> CONNECTION ERROR connection aborted (remote)
>>
>> The thing is that all of these are *internally* generated messages sent
>> to the console via fprintf, so my *application* doesn't really know
>> about them (though I could be crafty and interpose my own cheeky fprintf
>> to intercept them). That doesn't quite sound like the desired behaviour
>> for a robust system?
>>
>>
>> Similarly should I actually trap an error what's the correct way to
>> continue, as it happens currently my app carries on silently doing
>> nothing useful and continuing to do so even when the peer restarts (so
>> there is no magic internal reconnection logic as far as I can see).
>>
>> do I have to do a
>> messenger.stop()
>> messenger.start()
>>
>> cycle to get things going again, I'm guessing so, but I'll like to know
>> what the "correct"/expected way to create Messenger code that is robust
>> against remote failures, as far as I can see there are no examples of
>> that sort of thing?
> I've come up against similar problems, I think it's an area that needs
> some work in Proton. Is anybody already working on/thinking about this
> area?
>
> Cheers,
> Alan.
>
I'd definitely like to know how others deal with this sort of thing.

For info: leaving aside that I can't necessarily trap all the errors
without being devious around fprintf (which to be fair works, but it's
a bit sneaky, and if you have multiple Messenger instances it won't
tell you which one the error relates to), when I do get an error I
appear to have to start from scratch - in other words:

message.free();
messenger.free();
message = new proton.Message();
messenger = new proton.Messenger();
messenger.start();

If I try to restart the original messenger or use the existing queue, I
get no joy. It's not the end of the world, but I've no idea what robust
Messenger code is *supposed* to look like.
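
As a rough sketch only (not an official Proton recipe), the sequence above
could be wrapped in a helper so the application has one place to call once
it decides the connection is dead. The names createSession and recover are
hypothetical, and proton is assumed to be the object exposing the Message
and Messenger constructors used above. Only free() and start() are called,
since those are the only calls shown in this thread; how the error is
detected in the first place is left open.

```javascript
// Sketch under stated assumptions: `proton` is passed in so the helper
// makes no assumption about how the binding is loaded.
function createSession(proton) {
    var session = {
        message: new proton.Message(),
        messenger: new proton.Messenger(),
        restarts: 0 // number of times recover() has rebuilt the objects
    };
    session.messenger.start();

    // Hypothetical helper: free the old objects and start from scratch,
    // in the same order as the snippet above.
    session.recover = function() {
        session.message.free();
        session.messenger.free();
        session.message = new proton.Message();
        session.messenger = new proton.Messenger();
        session.messenger.start();
        session.restarts += 1;
    };

    return session;
}
```

The application would keep using session.message and session.messenger
rather than holding its own references, so that a call to
session.recover() swaps in the fresh instances everywhere at once.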

Presumably Alan and I aren't the only people who might like to be able
to trap errors and restart? Or does everyone else write code that never
fails ;->

F.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org

Re: proton Messenger error handling/recovery

Posted by Alan Conway <ac...@redhat.com>.

On Wed, 2014-09-03 at 20:05 +0100, Fraser Adams wrote:
> Hello,
> I've probably missed something, but I don't know how to reliably detect 
> failures and reconnect.
> 
> So if I sent to an address with a freshly stood up Messenger instance 
> and the address can't be found things aren't too bad and I wind up with 
> an ECONNREFUSED that I could do something with, however if I've been 
> sending messages to a valid address then I kill off the consumer I see a:
> 
> [0x513380]:ERROR amqp:connection:framing-error connection aborted
> [0x513380]:ERROR[-2] connection aborted
> 
> CONNECTION ERROR connection aborted (remote)
> 
> The thing is that all of these are *internally* generated messages sent 
> to the console via fprintf, so my *application* doesn't really know 
> about them (though I could be crafty and interpose my own cheeky fprintf 
> to intercept them). That doesn't quite sound like the desired behaviour 
> for a robust system?
> 
> 
> Similarly should I actually trap an error what's the correct way to 
> continue, as it happens currently my app carries on silently doing 
> nothing useful and continuing to do so even when the peer restarts (so 
> there is no magic internal reconnection logic as far as I can see).
> 
> do I have to do a
> messenger.stop()
> messenger.start()
> 
> cycle to get things going again, I'm guessing so, but I'd like to know
> what the "correct"/expected way to create Messenger code that is robust 
> against remote failures, as far as I can see there are no examples of 
> that sort of thing?

I've come up against similar problems, I think it's an area that needs
some work in Proton. Is anybody already working on/thinking about this
area?

Cheers,
Alan.

