Posted to dev@qpid.apache.org by Alan Conway <ac...@redhat.com> on 2008/08/18 19:19:16 UTC

Re: Reliability and the AMQP spec.

I'm teasing out the reliability guarantees offered in different
situations. Some things are becoming clear and I want to sound them out.
This stuff would be good in a user manual or training material...

The exactly-once guarantee is valid if and only if neither the
broker nor the client ever crashes. By crash I mean a session is
interrupted in a non-orderly fashion and is never successfully resumed.

We will eventually implement the broker half of that guarantee but the
client half is quite heavy going: the client must reliably maintain its
AMQP state persistently or be some kind of cluster in its own right.
Currently this isn't even possible with Qpid as we don't provide access
to the session state in a form that could be used for recovering a
session in a different process.

So if we lower our sights: what guarantees can we offer to
non-bulletproof clients? It turns out the two weaker guarantees
correspond to AMQP protocol options. It would be good to get the spec
clarified a little to make this obvious:

At least once: accept-mode=explicit:
 - broker MUST NOT forget a message till it receives an accept from the
client.
 - client MUST process the message transfer on receipt and MUST send an
accept only after processing.
 - risk: if client A crashes, other clients may receive messages already
processed by A.

At most once: accept-mode=none:
 - broker MUST forget messages upon sending the transfer
 - client MUST NOT process message on receiving the transfer. It MUST
send an accept and wait for the accept to complete before processing.
 - risk: if client A crashes, messages that it has not yet processed may
be lost.


The first case is straightforward in Qpid, but the second case is not.
The trivial implementation (send a synchronous accept for each incoming
message) will perform dreadfully. Is there a case for extending the API
or adding utility classes for this use case? E.g. with a variation of
MessageListener that passes you the messages when accepts complete
rather than when the transfer arrives?


Re: Reliability and the AMQP spec.

Posted by William Henry <wh...@redhat.com>.
Alan Conway wrote:
> I'm teasing out the reliability guarantees offered in different
> situations, some things are becoming clear and I want to sound them out.
> This stuff would be good in user manual or training material...
>
> The exactly-once guarantee is only valid if and only if neither the
> broker nor the client ever crash. By crash I mean a session is
> interrupted in a non-orderly fashion and is never successfully resumed.
>
> We will eventually implement the broker half of that guarantee but the
> client half is quite heavy going: the client must reliably maintain its
> AMQP state persistently or be some kind of cluster in its own right.
> Currently this isn't even possible with Qpid as we don't provide access
> to the session state in a form that could be used for recovering a
> session in a different process.
>
> So if we lower our sights: what guarantees can we offer to
> non-bulletproof clients? It turns out the two weaker guarantees
> correspond to AMQP protocol options: it would be good to get the spec
> clarified a little to make this obvious:
>
> At least once: accept-mode=explicit:
>  - broker MUST NOT forget a message till it receives an accept from the
> client.
>  - client MUST process the message transfer on receipt and MUST send an
> accept only after processing.
>  - risk: if client A crashes, other clients may receive messages already
> processed by A.
>
> At most once: accept-mode=none:
>  - broker MUST forget messages upon sending the transfer
>  - client MUST NOT process message on receiving the transfer. It MUST
> send an accept and wait for the accept to complete before processing.
>  - risk: if client A crashes, messages that it has not yet processed may
> be lost.
>
>
> The first case is straightforward in Qpid, but the second case is not.
> The trivial implementation (send a synchronous accept for each incoming
> message) will perform dreadfully. Is there a case for extending the API
> or adding utility classes for this use case? E.g. with a variation of
> MessageListener that passes you the messages when accepts complete
> rather than when the transfer arrives?
>
>   
Regarding the last paragraph: can we do something using negative acks
instead? I.e. the client only sends an acknowledgement if it sees, from
the message sequence numbers, that it has missed a message, and then
asks for a resend.
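
A minimal sketch of that idea, assuming each message carries a sequence
number starting at 1. GapDetector and its method names are hypothetical
and not part of any Qpid API. The client stays silent while numbers
arrive in order and asks for a resend only when it sees a gap:

```python
class GapDetector:
    """Receiver-side tracker: request resends only for missing sequence numbers."""

    def __init__(self, first_expected=1):
        self.next_expected = first_expected
        self.missing = set()  # sequence numbers skipped and not yet resent

    def on_message(self, seq):
        """Return the sequence numbers to negatively acknowledge after seeing seq."""
        if seq in self.missing:
            # A previously missed message has arrived (a resend); nothing to request.
            self.missing.discard(seq)
            return []
        if seq < self.next_expected:
            # Already seen; ignore the duplicate.
            return []
        # Everything between the expected number and seq was skipped.
        gap = list(range(self.next_expected, seq))
        self.missing.update(gap)
        self.next_expected = seq + 1
        return gap


detector = GapDetector()
# Messages 3 and 4 are skipped, then 3 is resent.
nacks = [detector.on_message(seq) for seq in (1, 2, 5, 3, 6)]
```

Note that a gap is only noticed once a later message arrives, and the
sender would have to keep messages around in order to resend them.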


William



Re: Reliability and the AMQP spec.

Posted by Gordon Sim <gs...@redhat.com>.
Alan Conway wrote:
> On Tue, 2008-08-19 at 08:20 +0100, Gordon Sim wrote:
>> I was trying to describe an approach an application might use to get 
>> exactly-once processing guarantees; an accept would be used to prevent 
>> loss of messages, and completion of that accept might need to be tracked 
>> to update the state used to detect duplicates.
> 
> I don't think this is possible with unreliable clients and without using
> application-specific knowledge to screen duplicates, but I'd love to be
> proved wrong again :)

Agreed; that was exactly the case I was referring to.

> The problem I see: if you a) process the message before sending accept,
> you can get a duplicate as above. If you b) defer processing till after
> the accept completes you can lose a message if the client crashes
> between sending accept and processing the message. So if you do b) the
> client has to be made reliable in some way (logging the message before
> sending accept etc.) or if you do a) there has to be some
> application-specific way to screen duplicates that might be delivered to
> another client/a subsequent incarnation of this client.

Right; in general that might be through a set of sequence numbers for
different publishing streams the application is aware of. That's easy to
maintain. If however it needs per-message state recorded in order to
detect duplicates, then that state needs to be cleared up, and completion
of the accept would be the point at which you could do that. (Not
something I'm suggesting we offer right now though).
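
As a rough illustration of the sequence-number approach, assuming each
publishing stream numbers its messages from 1 and delivers them in order
(StreamDeduplicator and deliver are hypothetical names, not Qpid API):

```python
class StreamDeduplicator:
    """Screens duplicates using one sequence counter per publishing stream."""

    def __init__(self):
        self.last_seen = {}  # stream id -> highest sequence number processed

    def is_duplicate(self, stream, seq):
        return seq <= self.last_seen.get(stream, 0)

    def record(self, stream, seq):
        self.last_seen[stream] = max(seq, self.last_seen.get(stream, 0))


def deliver(dedup, processed, stream, seq, body):
    """Process body unless (stream, seq) was already processed; return True if processed."""
    if dedup.is_duplicate(stream, seq):
        return False
    processed.append(body)
    dedup.record(stream, seq)
    return True


dedup = StreamDeduplicator()
processed = []
results = [
    deliver(dedup, processed, "orders", 1, "o1"),
    deliver(dedup, processed, "orders", 2, "o2"),
    deliver(dedup, processed, "orders", 2, "o2"),  # redelivery, screened out
    deliver(dedup, processed, "prices", 1, "p1"),  # separate stream, own counter
]
```

The state is a single integer per stream, so nothing needs clearing up
when an accept completes. For this to survive a client crash, the
counters would have to be stored together with the results of
processing; that part is not shown.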

Re: Reliability and the AMQP spec.

Posted by Alan Conway <ac...@redhat.com>.
On Tue, 2008-08-19 at 08:20 +0100, Gordon Sim wrote:
> Then I am confused: above I thought we had now agreed that at-most-once 
> guarantees could be obtained by not requiring an accept at all?
Yes, I have been confusing the issue by mixing singleton and cluster
problems in my mind. Let me try to get this straight from the client's
perspective, where we treat the broker as a singleton with no cluster
issues:

at-most-once: use accept-mode=none, broker forgets message as soon as it
is sent, client processes message as soon as it is received. No accepts.
Messages can be lost if the client crashes. Easy.

at-least-once: use accept-mode=explicit. Client processes message
immediately and then sends accept. Message can be double-processed if
client crashes before sending accept.

So you're right Gordon, both of these cases can be handled in a
straightforward way with the existing API.
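
The two cases can be sketched with a toy in-memory model. ToyBroker and
run are made up for illustration and do not reflect the Qpid API or real
broker internals; the model only captures when the broker forgets a
message:

```python
from collections import deque


class ToyBroker:
    """Single queue; accept_mode decides when a sent message is forgotten."""

    def __init__(self, messages, accept_mode):
        assert accept_mode in ("none", "explicit")
        self.accept_mode = accept_mode
        self.queue = deque(messages)
        self.unaccepted = []  # sent but not yet accepted (explicit mode only)

    def transfer(self):
        msg = self.queue.popleft()
        if self.accept_mode == "explicit":
            self.unaccepted.append(msg)  # remember until the client accepts
        return msg  # with accept-mode=none the message is now forgotten

    def accept(self, msg):
        self.unaccepted.remove(msg)

    def client_crashed(self):
        # Unaccepted messages go back to the front of the queue, in order.
        self.queue.extendleft(reversed(self.unaccepted))
        self.unaccepted.clear()


def run(accept_mode, crash_after_processing):
    """First client takes one message and crashes; a second client drains the queue."""
    broker = ToyBroker(["m1", "m2"], accept_mode)
    processed = []
    msg = broker.transfer()
    if crash_after_processing:
        processed.append(msg)  # processed, but the crash comes before any accept
    broker.client_crashed()
    while broker.queue:
        msg = broker.transfer()
        processed.append(msg)
        if accept_mode == "explicit":
            broker.accept(msg)
    return processed


lost = run("none", crash_after_processing=False)
doubled = run("explicit", crash_after_processing=True)
```

Here lost comes out without m1 (at-most-once: the message is gone), and
doubled contains m1 twice (at-least-once: the message is redelivered
after the crash).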

> I was trying to describe an approach an application might use to get 
> exactly-once processing guarantees; an accept would be used to prevent 
> loss of messages, and completion of that accept might need to be tracked 
> to update the state used to detect duplicates.

I don't think this is possible with unreliable clients and without using
application-specific knowledge to screen duplicates, but I'd love to be
proved wrong again :)

The problem I see: if you a) process the message before sending accept,
you can get a duplicate as above. If you b) defer processing till after
the accept completes you can lose a message if the client crashes
between sending accept and processing the message. So if you do b) the
client has to be made reliable in some way (logging the message before
sending accept etc.) or if you do a) there has to be some
application-specific way to screen duplicates that might be delivered to
another client/a subsequent incarnation of this client.
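
For option b), "logging the message before sending accept" might look
roughly like the sketch below. Journal is a hypothetical helper, the file
format is an assumption, and the accept itself is left out:

```python
import json
import os
import tempfile


class Journal:
    """Append-only log of received messages and processed markers."""

    def __init__(self, path):
        self.path = path

    def _append(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make the record durable before continuing

    def log_received(self, msg_id, body):
        self._append({"type": "received", "id": msg_id, "body": body})

    def log_processed(self, msg_id):
        self._append({"type": "processed", "id": msg_id})

    def pending(self):
        """Messages logged as received but not yet marked processed, in arrival order."""
        if not os.path.exists(self.path):
            return []
        received, done = {}, set()
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                if record["type"] == "received":
                    received[record["id"]] = record["body"]
                else:
                    done.add(record["id"])
        return [(i, body) for i, body in received.items() if i not in done]


path = os.path.join(tempfile.mkdtemp(), "journal.log")
journal = Journal(path)
journal.log_received("m1", "hello")   # 1. log the message durably
# 2. send the accept and wait for it to complete (not shown)
# 3. crash here: a restarted client reopens the journal and finds the message
recovered = Journal(path).pending()
journal.log_processed("m1")           # 4. after processing, mark it done
remaining = Journal(path).pending()
```

There is still a window between processing a message and writing the
processed marker, so unless those two steps are atomic the restarted
client may process the message again.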



Re: Reliability and the AMQP spec.

Posted by Gordon Sim <gs...@redhat.com>.
Alan Conway wrote:
> On Mon, 2008-08-18 at 19:15 +0100, Gordon Sim wrote:
>> Alan Conway wrote:
>>> At most once: accept-mode=none:
>>>  - broker MUST forget messages upon sending the transfer
>>>  - client MUST NOT process message on receiving the transfer. It MUST
>>> send an accept and wait for the accept to complete before processing.
>> The client never needs to send an accept in this case; it can process 
>> the message as soon as it receives it.
> Yes, I think that was a cut-paste error in my mail.
> 
>>>  - risk: if client A crashes, messages that it has not yet processed may
>>> be lost.
>>>
>>>
>>> The first case is straightforward in Qpid, but the second case is not.
>>> The trivial implementation (send a synchronous accept for each incoming
>>> message) will perform dreadfully. 
>> You only need to wait for completion of the accept if you are updating 
>> some application state based on the assumption that the accepted message 
>> will never be redelivered.
> 
> Yes, that's what I mean by at-most-once delivery.

Then I am confused: above I thought we had now agreed that at-most-once 
guarantees could be obtained by not requiring an accept at all?

I was trying to describe an approach an application might use to get 
exactly-once processing guarantees; an accept would be used to prevent 
loss of messages, and completion of that accept might need to be tracked 
to update the state used to detect duplicates.

>>> Is there a case for extending the API
>>> or adding utility classes for this use case? E.g. with a variation of
>>> MessageListener that passes you the messages when accepts complete
>>> rather than when the transfer arrives?
>> I don't really understand why you would want to do that.
> 
> So that you can process messages with an at-most-once guarantee. The
> current API makes it tricky to do this without creating synchronous
> blocks that are a performance problem.

Again, I thought at-most-once came from accept-mode=none, so there would 
be _no_ accepts, synchronous or otherwise.

As above, I was referring to an application enforced exactly-once 
guarantee using some sort of idempotence barrier that requires 
updating/clean-up when the accept completes.

>> (I can see some benefit in being able to track completion 
>> asynchronously; I can also see some benefit in extending AckPolicy to do 
>> synchronous accepts if desired.)
> 
> Processing messages based on async completion of accept is exactly what
> I'm proposing above. Are you thinking of a more generic way of tracking
> arbitrary commands? That would work too provided there is an easy way to
> figure out the set of accepted messages when the accept completes.

In the scenario I can envisage (as above, an application level 
exactly-once policy), the message itself would be processed before 
sending the accept (because we don't want lost messages). The 
application would track completion of the accept so we could clean up 
whatever state was needed to detect duplicate delivery of that message.
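
A sketch of that scenario, with hypothetical names and in-memory state
only (a real application would need this state to survive a crash):

```python
class IdempotenceBarrier:
    """Per-message duplicate detection, cleaned up when the accept completes."""

    def __init__(self):
        self.in_doubt = set()  # processed, but the accept has not completed yet

    def should_process(self, msg_id):
        return msg_id not in self.in_doubt

    def mark_processed(self, msg_id):
        self.in_doubt.add(msg_id)

    def on_accept_completed(self, msg_id):
        # The broker will not redeliver this message, so the state can go.
        self.in_doubt.discard(msg_id)


barrier = IdempotenceBarrier()
processed = []


def on_transfer(msg_id):
    """Process before accepting, so a crash cannot lose the message."""
    if barrier.should_process(msg_id):
        processed.append(msg_id)
        barrier.mark_processed(msg_id)
    # The accept would be sent here, without waiting for completion.


on_transfer("m1")
on_transfer("m1")                   # redelivered before the accept completed
barrier.on_accept_completed("m1")   # completion arrives; state is cleaned up
```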

Of course ideally it would just use one or more sequence counters for
detecting duplicates, which would likely remove the need to track
completion of the accept. In other words, while I can see value in
providing the mechanisms needed to do this sort of thing, I suspect
many applications will not require it; it's a fairly specialised use-case.

Re: Reliability and the AMQP spec.

Posted by Alan Conway <ac...@redhat.com>.
On Mon, 2008-08-18 at 19:15 +0100, Gordon Sim wrote:
> Alan Conway wrote:
> > At most once: accept-mode=none:
> >  - broker MUST forget messages upon sending the transfer
> >  - client MUST NOT process message on receiving the transfer. It MUST
> > send an accept and wait for the accept to complete before processing.
> 
> The client never needs to send an accept in this case; it can process 
> the message as soon as it receives it.
Yes, I think that was a cut-paste error in my mail.

> >  - risk: if client A crashes, messages that it has not yet processed may
> > be lost.
> > 
> > 
> > The first case is straightforward in Qpid, but the second case is not.
> > The trivial implementation (send a synchronous accept for each incoming
> > message) will perform dreadfully. 
> 
> You only need to wait for completion of the accept if you are updating 
> some application state based on the assumption that the accepted message 
> will never be redelivered.

Yes, that's what I mean by at-most-once delivery.

> > Is there a case for extending the API
> > or adding utility classes for this use case? E.g. with a variation of
> > MessageListener that passes you the messages when accepts complete
> > rather than when the transfer arrives?
> 
> I don't really understand why you would want to do that.

So that you can process messages with an at-most-once guarantee. The
current API makes it tricky to do this without creating synchronous
blocks that are a performance problem.

> (I can see some benefit in being able to track completion 
> asynchronously; I can also see some benefit in extending AckPolicy to do 
> synchronous accepts if desired.)

Processing messages based on async completion of accept is exactly what
I'm proposing above. Are you thinking of a more generic way of tracking
arbitrary commands? That would work too provided there is an easy way to
figure out the set of accepted messages when the accept completes.
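
To make the proposal concrete, here is a rough sketch of such a listener.
The class name, the send_accept(ids, on_complete) callback and the
batching are all assumptions for illustration, not existing Qpid API.
Each batch is captured in the completion callback, which is one easy way
to know the set of accepted messages when the accept completes:

```python
class AcceptThenDeliver:
    """Buffers transfers, accepts them in batches, delivers when the accept completes."""

    def __init__(self, send_accept, on_message, batch_size=3):
        self.send_accept = send_accept  # hypothetical: send_accept(ids, on_complete)
        self.on_message = on_message    # application callback: on_message(id, body)
        self.batch_size = batch_size
        self.buffer = []

    def received(self, msg_id, body):
        """Called when a transfer arrives; nothing is delivered yet."""
        self.buffer.append((msg_id, body))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send one asynchronous accept covering everything buffered so far."""
        if not self.buffer:
            return
        batch, self.buffer = self.buffer, []
        ids = [msg_id for msg_id, _ in batch]
        self.send_accept(ids, lambda: self._completed(batch))

    def _completed(self, batch):
        for msg_id, body in batch:
            self.on_message(msg_id, body)


# Stand-in for the session: records accepts so completion can be triggered later.
pending_completions = []
delivered = []


def fake_send_accept(ids, on_complete):
    pending_completions.append((ids, on_complete))


listener = AcceptThenDeliver(
    fake_send_accept, lambda msg_id, body: delivered.append(msg_id), batch_size=2
)
listener.received(1, "a")
listener.received(2, "b")   # batch is full: one accept is sent for messages 1 and 2
listener.received(3, "c")   # stays buffered until the next flush
delivered_before = list(delivered)
for _, on_complete in pending_completions:
    on_complete()           # the accept completes
delivered_after = list(delivered)
```

Batching avoids a synchronous round trip per message, at the cost of
holding messages back until the batch fills or flush is called.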


Re: Reliability and the AMQP spec.

Posted by Gordon Sim <gs...@redhat.com>.
Alan Conway wrote:
> At most once: accept-mode=none:
>  - broker MUST forget messages upon sending the transfer
>  - client MUST NOT process message on receiving the transfer. It MUST
> send an accept and wait for the accept to complete before processing.

The client never needs to send an accept in this case; it can process 
the message as soon as it receives it.

>  - risk: if client A crashes, messages that it has not yet processed may
> be lost.
> 
> 
> The first case is straightforward in Qpid, but the second case is not.
> The trivial implementation (send a synchronous accept for each incoming
> message) will perform dreadfully. 

You only need to wait for completion of the accept if you are updating 
some application state based on the assumption that the accepted message 
will never be redelivered.

> Is there a case for extending the API
> or adding utility classes for this use case? E.g. with a variation of
> MessageListener that passes you the messages when accepts complete
> rather than when the transfer arrives?

I don't really understand why you would want to do that.

(I can see some benefit in being able to track completion 
asynchronously; I can also see some benefit in extending AckPolicy to do 
synchronous accepts if desired.)