You are viewing a plain text version of this content. The canonical link for it is here.
Posted to proton@qpid.apache.org by Phil Harvey <ph...@philharveyonline.com> on 2013/04/02 17:41:43 UTC

Re: Defining the behaviour of Proton Engine API under error conditions

Thanks for the response.  It does clarify the Engine's semantics, and the
intended division of responsibility between the Engine and the
application.  I intend to document this soon in a short conceptual summary
under proton/docs/engine/.

I chatted to Keith about this and we're uncertain about some of the details
of the steps that follow an invalid frame being pushed into the Engine. To
illustrate this, we wrote the following pseudo-code for the main loop of a
typical application (eg a messaging client), similar to Driver's
pn_connector_process function.

1   tail_buf = pn_transport_tail()
2   tail_capacity = pn_transport_capacity()
3   read = socket_recv(tail_buf, tail_capacity)
4   # ... [1]
5
6   push_err_no = pn_transport_push(read) # see [Q1]
7
8   if (push_err_no < 0)
9     socket_shutdown(SHUTDOWN_READ)
10  end if
11
12  # ... [2]
13
14  head_pending = pn_transport_pending() # see [Q2]
15  if(head_pending > 0)
16    head_buf = pn_transport_head()
17    written = socket_send(head_buf, head_pending)
18    # ... [3]
19
20    pn_transport_pop(written)
21  else if (head_pending < 0)
22    socket_shutdown(SHUTDOWN_WRITE)
23  end if

Elided sections:
[1] A well-behaved application would call pn_transport_close_tail() if
socket_recv() < 0
[2] Application makes use of top half API - pn_session_head(),
pn_work_head() etc
[3] A well-behaved application would call pn_transport_close_head() if
socket_send() < 0


=== Questions about error handling ===

Imagine that the bytes read from the socket on line 3 represent a valid
frame followed by a frame that is invalid (e.g. because it contains a field
of an unexpected datatype). In this case:

[Q1] Should pn_transport_push return -1 on line 6, thereby signalling that
the application can't push any more bytes into it?

[Q2] On lines 14-21, what is in the transport's outgoing byte queue? We
expect that it would be:
  <frame1 corresponding to top-half API calls on line 12>
  <frame2 ... >
  <...>
  <the CLOSE frame triggered by the invalid input>.

Or maybe the CLOSE frame somehow replaces the other outgoing frames in the
transport's outgoing byte queue?

Note: if the application supports failover, it would subsequently unbind
the transport, create a new socket, create a new transport, and bind the
existing connection to it.


Phil



On 28 March 2013 16:25, Rafael Schloming <rh...@alum.mit.edu> wrote:

> On Thu, Mar 28, 2013 at 11:16 AM, Phil Harvey <phil@philharveyonline.com
> >wrote:
>
> > On 28 March 2013 13:17, Rafael Schloming <rh...@alum.mit.edu> wrote:
> >
> > > On Thu, Mar 28, 2013 at 5:31 AM, Rob Godfrey <rob.j.godfrey@gmail.com
> > > >wrote:
> > >
> > > > On 28 March 2013 02:45, Rafael Schloming <rh...@alum.mit.edu> wrote:
> > > >
> > > > > On Wed, Mar 27, 2013 at 6:34 PM, Rob Godfrey <
> > rob.j.godfrey@gmail.com
> > > > > >wrote:
> > > > >
> > > > > > On 27 March 2013 21:16, Rafael Schloming <rh...@alum.mit.edu>
> wrote:
> > > > > > > On Wed, Mar 27, 2013 at 11:53 AM, Keith W <
> keith.wall@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > >
> > > > > > [..snip..]
> > >
> >
> > [..snip..]
> >
> > >
> > > To answer your question, say there is a framing error/the-wire is cut
> > > (really there isn't any way to know the difference since you could cut
> > the
> > > wire half way through a frame header), the transport interface will
> write
> > > out the close frame as required by the spec, and it will indicate
> through
> > > it's error interface that an error has occurred, however it won't alter
> > any
> > > of the local/remote states of the top half endpoints. The local states
> > > remain reflective of the local apps desired state, and the remote
> states
> > > remain reflective of the remote apps last known desired state. This
> kind
> > of
> > > has to be this way because you don't want to confuse links being
> > > involuntarily detached because the wire was cut with the remote
> endpoint
> > > wanting to actively shut down the link.
> > >
> >
> > I find this a little confusing.  After the Transport has silently sent
> the
> > Close frame, what would the local Application typically do next in order
> to
> > get the Engine back to a usable state?
> >
>
> It would unbind the transport. When the transport is unbound, all the
> remote state is cleared, and the app is free to use the connection/endpoint
> data structure as if it had simply built them up into their current state
> explicitly via constructors as opposed to it being the result of network
> interactions.
>
>
> >
> > There is an alternative approach which I would find simpler.  In this
> > alternative. the Engine would not implicitly send the Close frame.
> > Instead, the Application would explicitly control this by doing the
> > following:
> >
> > - The Application checks the Transport's error state as usual
> > - The Application discovers that the Transport is in an error state and
> > therefore calls
> Connection.setCondition(errorDetailsObtainedFromTransport)
> > followed by Connection.close()
> > - The Application calls Transport.output (or pn_transport_head in
> > proton-c), causing the Close frame bytes to be produced.
> >
> > As a Proton Engine developer I would find this simpler to implement.
> >
>
> I'm not sure I follow why this would be any simpler to implement.
>
>
> >
> > Moreover, as a Proton Engine user, it gives me a clearer separation of
> > responsibility between the application and the Engine.
> >
> > Maybe I'm just not grokking the Engine's philosophy, but on the whole the
> > Engine API feels like it gives me control over what frames are being
> > produced (though not *when* - and that's fine by me), so I find the idea
> of
> > the Transport layer silently sending a Close rather surprising.
> >
>
> As a user of the top half, I don't think you should need to be at all aware
> of what frames are/aren't sent by the engine. You should only need to think
> about the semantics that the top half of the API provide. So I would ask
> you, why do you care what frames are/aren't sent by the engine? It seems to
> me that all you should care about are the intentions of the remote
> application, e.g. did the remote application intend to open/close a link,
> session, or connection.
>
> I think it is key to understand the dual nature of the open/close,
> begin/end, and attach/detach frames as defined by the protocol They are
> used to express two classes of things, i.e. they are frames that combine
> information from multiple distinct layers. They are used in a purely
> structural/framing/formatting level to manage on-the-wire constructs such
> as frame sizes, channel numbers, handles, etc. Concepts that never need to
> leak outside the transport. They are also used at a higher semantic level,
> e.g. to express an application's intent to establish a link. Not every use
> of one of these frames involves both elements, for example if a very small
> number of channels is negotiated by the transport, but the app wants to use
> a lot of sessions, then the engine implementation might well
> detach/reattach sessions on demand in order to share the limited number of
> available channels. This would never be visible to the top half, and many
> attach/detach frames would be sent without the application doing anything
> to trigger it.
>
> I think the particular case of a framing-error style close is somewhat
> similar. Mechanically/structurally the close frame is required to conform
> to the spec, but in such a case it is not expressing the intent of the
> *application* to close the connection, it is expressing the intent of the
> *transport* to close the connection.
>
>
> >
> > What are people's views on this?
> >
>
> The spec doesn't allow you to do anything other than send a close frame
> with an appropriate condition if a framing error occurs. It seems odd for
> us to ask the app developer to manually do this when the impl could just as
> well do it on its own. What you describe would require the app developer to
> always include extra boilerplate, and furthermore it puts the burden of
> maintaining spec compliance onto that extra boilerplate code. I would say
> anything that allows the app developer to make the engine violate the
> protocol spec is probably a bad thing, and anything requiring the app
> developer to be vigilant and follow good coding practices in order to avoid
> violating the spec is certainly a really really bad thing. ;-)
>
> --Rafael
>