You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@qpid.apache.org by Tobias Duckworth <to...@duckworth.uk.com> on 2017/03/07 16:51:45 UTC

pn_delivery_finalize Assertion failure

Hello,

I'm using qpid-proton 0-12.2 in 'connection_engine' mode.

Very occasionally I'm getting errors in the function pn_delivery_finalize
(engine.c) - I've been trying to work out what's going wrong here, but the
call pattern is pretty tricky to work out (partly due to the use of macros
to define functions on pn_object)

What I can concretely see sometimes when this goes wrong is either of the
following assertions triggering:

 pn_delivery_finalize: Assertion `pn_refcount(delivery) == 0' failed.
 pn_delivery_finalize: Assertion `pn_refcount(delivery) == 1' failed.

It seems like the function pn_delivery_finalize gets called multiple times,
and can end up recursively calling itself via some of the functions it calls
internally.

What I do notice every time that this happens is that delivery->state.init
is false, whereas when it doesn't go wrong this is set to true. 
I have convinced myself by means of adding some debug output that the
function pn_delivery_state_init is getting called on my delivery object, so
it should be initialised as true, but for some reason by the time
pn_delivery_finalize is called that is no longer the case.

Has anyone else seen either of these assertions trigger?

I've put in a whole load of debug output to try to work out where the
problem is, but am getting nowhere with it.

Any ideas or help gratefully received.

Thanks,
Toby



--
View this message in context: http://qpid.2158936.n2.nabble.com/pn-delivery-finalize-Assertion-failure-tp7660170.html
Sent from the Apache Qpid users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org


Re: pn_delivery_finalize Assertion failure

Posted by Tobias Duckworth <to...@duckworth.uk.com>.
Oh....

I also see delivery->local.settled as true when the problem happens, but it
is false when the problem does not happen.

Thanks,
Toby



--
View this message in context: http://qpid.2158936.n2.nabble.com/pn-delivery-finalize-Assertion-failure-tp7660170p7660171.html
Sent from the Apache Qpid users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org


Re: pn_delivery_finalize Assertion failure

Posted by Alan Conway <ac...@redhat.com>.
On Fri, 2017-03-24 at 15:42 +0100, Rob Godfrey wrote:
> 
> > Note that unsettled deliveries will outlive the link they were
> > created on, the AMQP spec allows settlement after the link has
> > closed.
> > 
> > 
> 
> To be precise the spec allows for the state of the delivery to be
> updated
> after the link has been *detached*.��Once a link is closed then the
> delivery should no longer be updatable through disposition exchanges,
> etc
> (the default outcome for the source should be applied at that point).
> 
> (From 2.7.6 Dispositions : "The disposition performative MAY refer to
> deliveries on links that are no longer attached. As long as the links
> have
> not been closed or detached with an error then the deliveries are
> still
> 'live' and the updated state MUST be applied." -
> http://docs.oasis-open.org/amqp/core/v1.0/os/amqp-core-transport-v1.0
> -os.html#type-disposition
> )
> 
> -- Rob

Thanks Rob, I was confused on that point - that makes much more sense.




---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org


Re: pn_delivery_finalize Assertion failure

Posted by Rob Godfrey <ro...@gmail.com>.
On 24 March 2017 at 14:38, Alan Conway <ac...@redhat.com> wrote:

> On Thu, 2017-03-09 at 02:06 -0700, Tobias Duckworth wrote:
> > I got to the bottom of this problem.
> >
> > The problem is to to with the lifetime of the proton::delivery
> > object.
> >
> > In a connection_engine build, the IO is asynchronous to the
> > proton::handler
> > interface, and therefore this problem may only occur in
> > connection_engine
> > builds (not absolutely certain, as I've not tested on a 'normal'
> > build).
> >
> > When a message is received, it comes along with a delivery object.
> > The
> > delivery object can be used to accept, reject, release the delivery
> > with the
> > broker.
> >
> > Internally to the proton-c C implementation there are a complicated
> > set of
> > state transitions that happen which I have not totally got my head
> > around.
> > However, what I do know is that the delivery object needs to stick
> > around
> > until the operation is totally finished - Otherwise it's possible for
> > the
> > observed behaviour to happen, which causes the assertion failures or
> > a much
> > nastier crash.
> >
> > The call to proton::delivery::accept (or release, reject) returns
> > almost
> > straightaway, and then the actual work of settling the message with
> > the
> > broker happens when the underlying IO operation is invoked with
> > process(),
> > or process_nothrow(). If, by this time, the delivery object that was
> > used
> > has been deleted or destroyed, then the problem will happen
> >
> > Turning on tracing using PN_TRACE_FRM, one can see the following when
> > the
> > engine has finished with the delivery:
> >
> > [0x2b4af8030500]:1 -> @disposition(21) [role=true, first=124,
> > last=124,
> > settled=true, state=@accepted(36) []] .
> >
> > Unfortunately, there is no interface back to the C++ world to tell us
> > when
> > the delivery is finished with, and so there appears to be no way to
> > know
> > when it is safe to destroy the delivery object.
> >
>
> There are callbacks, but on the sender side the delivery is referred to
> as a "tracker", "delivery" is only used on the receiver side. This is
> an attempt to rationalize the API since there are different ops
> relevant on sender and receiver side.
>
> > Other than keeping proton::delivery objects sticking around in memory
> > forever (which does solve the problem, but is not an option), does
> > anyone
> > have any ideas for solving this?
> >
> > It seems to me like the C++ proton::delivery object itself should
> > know about
> > whether it has been settled with the broker or not, so unless anyone
> > has a
> > better suggestion this is likely to be what I will add.
> >
>
> They should be deleted automatically without user intervention after
> they settle and all events are processed. If you have a reproducer to
> show they are not then raise a JIRA (even better if you attach a fix
> :).



> Note that unsettled deliveries will outlive the link they were
> created on, the AMQP spec allows settlement after the link has closed.
>
>
To be precise the spec allows for the state of the delivery to be updated
after the link has been *detached*.  Once a link is closed then the
delivery should no longer be updatable through disposition exchanges, etc
(the default outcome for the source should be applied at that point).

(From 2.7.6 Dispositions : "The disposition performative MAY refer to
deliveries on links that are no longer attached. As long as the links have
not been closed or detached with an error then the deliveries are still
'live' and the updated state MUST be applied." -
http://docs.oasis-open.org/amqp/core/v1.0/os/amqp-core-transport-v1.0-os.html#type-disposition
)

-- Rob


>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> For additional commands, e-mail: users-help@qpid.apache.org
>
>

Re: pn_delivery_finalize Assertion failure

Posted by Alan Conway <ac...@redhat.com>.
On Thu, 2017-03-09 at 02:06 -0700, Tobias Duckworth wrote:
> I got to the bottom of this problem.
> 
> The problem is to to with the lifetime of the proton::delivery
> object.
> 
> In a connection_engine build, the IO is asynchronous to the
> proton::handler
> interface, and therefore this problem may only occur in
> connection_engine
> builds (not absolutely certain, as I've not tested on a 'normal'
> build).
> 
> When a message is received, it comes along with a delivery object.
> The
> delivery object can be used to accept, reject, release the delivery
> with the
> broker.
> 
> Internally to the proton-c C implementation there are a complicated
> set of
> state transitions that happen which I have not totally got my head
> around.
> However, what I do know is that the delivery object needs to stick
> around
> until the operation is totally finished - Otherwise it's possible for
> the
> observed behaviour to happen, which causes the assertion failures or
> a much
> nastier crash.
> 
> The call to proton::delivery::accept (or release, reject) returns
> almost
> straightaway, and then the actual work of settling the message with
> the
> broker happens when the underlying IO operation is invoked with
> process(),
> or process_nothrow(). If, by this time, the delivery object that was
> used
> has been deleted or destroyed, then the problem will happen
> 
> Turning on tracing using PN_TRACE_FRM, one can see the following when
> the
> engine has finished with the delivery:
> 
> [0x2b4af8030500]:1 -> @disposition(21) [role=true, first=124,
> last=124,
> settled=true, state=@accepted(36) []] .
> 
> Unfortunately, there is no interface back to the C++ world to tell us
> when
> the delivery is finished with, and so there appears to be no way to
> know
> when it is safe to destroy the delivery object.
> 

There are callbacks, but on the sender side the delivery is referred to
as a "tracker", "delivery" is only used on the receiver side. This is
an attempt to rationalize the API since there are different ops
relevant on sender and receiver side.

> Other than keeping proton::delivery objects sticking around in memory
> forever (which does solve the problem, but is not an option), does
> anyone
> have any ideas for solving this?
> 
> It seems to me like the C++ proton::delivery object itself should
> know about
> whether it has been settled with the broker or not, so unless anyone
> has a
> better suggestion this is likely to be what I will add.
> 

They should be deleted automatically without user intervention after
they settle and all events are processed. If you have a reproducer to
show they are not then raise a JIRA (even better if you attach a fix
:). Note that unsettled deliveries will outlive the link they were
created on, the AMQP spec allows settlement after the link has closed.



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org


Re: pn_delivery_finalize Assertion failure

Posted by Tobias Duckworth <to...@duckworth.uk.com>.
I got to the bottom of this problem.

The problem is to to with the lifetime of the proton::delivery object.

In a connection_engine build, the IO is asynchronous to the proton::handler
interface, and therefore this problem may only occur in connection_engine
builds (not absolutely certain, as I've not tested on a 'normal' build).

When a message is received, it comes along with a delivery object. The
delivery object can be used to accept, reject, release the delivery with the
broker.

Internally to the proton-c C implementation there are a complicated set of
state transitions that happen which I have not totally got my head around.
However, what I do know is that the delivery object needs to stick around
until the operation is totally finished - Otherwise it's possible for the
observed behaviour to happen, which causes the assertion failures or a much
nastier crash.

The call to proton::delivery::accept (or release, reject) returns almost
straightaway, and then the actual work of settling the message with the
broker happens when the underlying IO operation is invoked with process(),
or process_nothrow(). If, by this time, the delivery object that was used
has been deleted or destroyed, then the problem will happen

Turning on tracing using PN_TRACE_FRM, one can see the following when the
engine has finished with the delivery:

[0x2b4af8030500]:1 -> @disposition(21) [role=true, first=124, last=124,
settled=true, state=@accepted(36) []] .

Unfortunately, there is no interface back to the C++ world to tell us when
the delivery is finished with, and so there appears to be no way to know
when it is safe to destroy the delivery object.

Other than keeping proton::delivery objects sticking around in memory
forever (which does solve the problem, but is not an option), does anyone
have any ideas for solving this?

It seems to me like the C++ proton::delivery object itself should know about
whether it has been settled with the broker or not, so unless anyone has a
better suggestion this is likely to be what I will add.

Thanks,
Toby








--
View this message in context: http://qpid.2158936.n2.nabble.com/pn-delivery-finalize-Assertion-failure-tp7660170p7660305.html
Sent from the Apache Qpid users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org


Re: pn_delivery_finalize Assertion failure

Posted by Tobias Duckworth <to...@duckworth.uk.com>.
Here's the corresponding debug output from when the problem does not happen:

(imp_server@ubuntu)1> [0x2b8a60030360]:1 <- @transfer(20) [handle=1,
delivery-id=0,
delivery-tag=b":,\xfa\x02\xad\xc5\x8cN\x8f\xafip\xbb\x97B\xfe",
message-format=0, more=false, batchable=false] (227)
"\x00Sr\xc1W\x06\xa3\x15x-opt-sequence-number\x80\x00\x00\x00\x00\x00\x00\x03C\xa3\x13iothub-enqueuedtime\x83\x00\x00\x01Z\xa9\xd0\xd4\xd3\xa3\x14iothub-deliverycountP\x00\x00Ss\xc0^\x0d\xa1$0b80015f-50f2-48e1-98cc-8ec0b2fdb6c5@\xa1*/devices/nAieblFn7nMO/messages/devicebound@@@@@@@@@@\x00St\xc1\x13\x02\xa1\x0aiothub-ack\xa1\x04full\x00Sw\xc1\x07\x02\xa1\x02idR\x01"
pn_delivery_finalize on delivery 0x2b8a6001a6a0
pni_link_live = 1
pni_preserve_delivery 0x2b8a6001a6a0
conn = 0x2b8a6002f4c0
delivery->local.settled = 0
delivery->state.init = 0
delivery->tpwork = 0
pni_preserve_delivery = 1
delivery->referenced = 1
pni_preserve_delivery 0x2b8a6001a6a0
conn = 0x2b8a6002f4c0
delivery->local.settled = 0
delivery->state.init = 0
delivery->tpwork = 0
2
pni_delivery_map_push
pn_delivery_state_init on delivery 0x2b8a6001a6a0
[0x2b8a60030360]:1 -> @flow(19) [next-incoming-id=2,
incoming-window=2147483647, next-outgoing-id=1, outgoing-window=2147483647,
handle=1, delivery-count=1, link-credit=10, drain=false]
2017-03-07 17:28:38.986 imp_server@ubuntu [debug] 70eba461caa3130b
<0.1770.0>@agent_host:maybe_show_log:624 log: <<"{ \"id\": 1 }">>
2017-03-07 17:28:38.997 imp_server@ubuntu [debug] 70eba461caa3130b
<0.1770.0>@agent_host:maybe_show_log:624 log: <<"{ \"iothub-ack\": \"full\"
}">>
Accepting delivery 0x2b8a6001b540
AmqpProtonReceiver::accept 0x2b8a6001b540
Calling accept on 0x2b8a6001b540
pn_delivery_settle 0x2b8a6001a6a0
pn_delivery_finalize on delivery 0x2b8a6001a6a0
pni_link_live = 1
pni_preserve_delivery 0x2b8a6001a6a0
conn = 0x2b8a6002f4c0
delivery->local.settled = 1
delivery->state.init = 1
delivery->tpwork = 1
pni_preserve_delivery = 1
delivery->referenced = 1
pni_preserve_delivery 0x2b8a6001a6a0
conn = 0x2b8a6002f4c0
delivery->local.settled = 1
delivery->state.init = 1
delivery->tpwork = 1
2
pn_delivery_finalize on delivery 0x2b8a6001a6a0
pni_link_live = 1
pni_preserve_delivery 0x2b8a6001a6a0
conn = 0x2b8a6002f4c0
delivery->local.settled = 1
delivery->state.init = 1
delivery->tpwork = 0
pni_preserve_delivery = 1
delivery->referenced = 1
pni_preserve_delivery 0x2b8a6001a6a0
conn = 0x2b8a6002f4c0
delivery->local.settled = 1
delivery->state.init = 1
delivery->tpwork = 0
2
pn_delivery_finalize on delivery 0x2b8a6001a6a0
pni_link_live = 1
pni_preserve_delivery 0x2b8a6001a6a0
conn = 0x2b8a6002f4c0
delivery->local.settled = 1
delivery->state.init = 0
delivery->tpwork = 0
pni_preserve_delivery = 0
delivery->referenced = 1
pni_preserve_delivery 0x2b8a6001a6a0
conn = 0x2b8a6002f4c0
delivery->local.settled = 1
delivery->state.init = 0
delivery->tpwork = 0
3
4
5
6 - pn_refcount(delivery) = 0
7
7.5 - pn_refcount(delivery) = 1
8
10
pn_delivery_finalize on delivery 0x2b8a6001a6a0 exits
[0x2b8a60030360]:1 -> @disposition(21) [role=true, first=0, last=0,
settled=true, state=@accepted(36) []]



--
View this message in context: http://qpid.2158936.n2.nabble.com/pn-delivery-finalize-Assertion-failure-tp7660170p7660180.html
Sent from the Apache Qpid users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org


Re: pn_delivery_finalize Assertion failure

Posted by Tobias Duckworth <to...@duckworth.uk.com>.
Here's some debug output from when the problem happens - In this case one of
the assertions triggers, but that is not always the case:

(imp_server@ubuntu)1> [0x2b8bd80026b0]:1 <- @transfer(20) [handle=1,
delivery-id=0,
delivery-tag=b"X\x00\xd2\xebX\x82\x0eO\x92\x08\x87\x92[i\xc6\xbd",
message-format=0, more=false, batchable=false] (227)
"\x00Sr\xc1W\x06\xa3\x15x-opt-sequence-number\x80\x00\x00\x00\x00\x00\x00\x03B\xa3\x13iothub-enqueuedtime\x83\x00\x00\x01Z\xa9\xca\xb0\x1f\xa3\x14iothub-deliverycountP\x00\x00Ss\xc0^\x0d\xa1$fb64f194-f8ee-4615-8adf-d4ae7dc22bd9@\xa1*/devices/nAieblFn7nMO/messages/devicebound@@@@@@@@@@\x00St\xc1\x13\x02\xa1\x0aiothub-ack\xa1\x04full\x00Sw\xc1\x07\x02\xa1\x02idR\x01"
pn_delivery_finalize on delivery 0x2b8bd801a660
pni_link_live = 1
pni_preserve_delivery 0x2b8bd801a660
conn = 0x2b8bd8000d10
delivery->local.settled = 0
delivery->state.init = 0
delivery->tpwork = 0
pni_preserve_delivery = 1
delivery->referenced = 1
pni_preserve_delivery 0x2b8bd801a660
conn = 0x2b8bd8000d10
delivery->local.settled = 0
delivery->state.init = 0
delivery->tpwork = 0
2
pni_delivery_map_push
pn_delivery_state_init on delivery 0x2b8bd801a660
[0x2b8bd80026b0]:1 -> @flow(19) [next-incoming-id=2,
incoming-window=2147483647, next-outgoing-id=1, outgoing-window=2147483647,
handle=1, delivery-count=1, link-credit=10, drain=false]
2017-03-07 17:21:56.376 imp_server@ubuntu [debug] 70eba461caa3130b
<0.1770.0>@agent_host:maybe_show_log:624 log: <<"{ \"id\": 1 }">>
2017-03-07 17:21:56.391 imp_server@ubuntu [debug] 70eba461caa3130b
<0.1770.0>@agent_host:maybe_show_log:624 log: <<"{ \"iothub-ack\": \"full\"
}">>
Accepting delivery 0x2b8bd801b510
AmqpProtonReceiver::accept 0x2b8bd801b510
Calling accept on 0x2b8bd801b510
pn_delivery_settle 0x2b8bd801a660
pn_delivery_finalize on delivery 0x2b8bd801a660
pni_link_live = 1
pni_preserve_delivery 0x2b8bd801a660
conn = 0x2b8bd8000d10
delivery->local.settled = 1
delivery->state.init = 0
delivery->tpwork = 0
pni_preserve_delivery = 0
delivery->referenced = 1
pni_preserve_delivery 0x2b8bd801a660
conn = 0x2b8bd8000d10
delivery->local.settled = 1
delivery->state.init = 0
delivery->tpwork = 0
3
pn_delivery_finalize on delivery 0x2b8bd801a660
pni_link_live = 1
pni_preserve_delivery 0x2b8bd801a660
conn = 0x2b8bd8000d10
delivery->local.settled = 1
delivery->state.init = 0
delivery->tpwork = 0
pni_preserve_delivery = 0
delivery->referenced = 1
pni_preserve_delivery 0x2b8bd801a660
conn = 0x2b8bd8000d10
delivery->local.settled = 1
delivery->state.init = 0
delivery->tpwork = 0
3
4
5
6 - pn_refcount(delivery) = 0
7
7.5 - pn_refcount(delivery) = 1
8
10
pn_delivery_finalize on delivery 0x2b8bd801a660 exits
[0x2b8bd80026b0]:1 -> @disposition(21) [role=true, first=0, last=0,
settled=true, state=@accepted(36) []]
      4
5
6 - pn_refcount(delivery) = 1
beam.smp: thirdparty/qpid-proton/proton-c/src/engine/engine.c:1454:
pn_delivery_finalize: Assertion `pn_refcount(delivery) == 0' failed.



--
View this message in context: http://qpid.2158936.n2.nabble.com/pn-delivery-finalize-Assertion-failure-tp7660170p7660174.html
Sent from the Apache Qpid users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org


Re: pn_delivery_finalize Assertion failure

Posted by Tobias Duckworth <to...@duckworth.uk.com>.
In engine.c, the function pn_delivery_settle(pn_delivery_t *delivery) has the
following:

pn_incref(delivery);
pn_decref(delivery);

Is there a reason for this?




--
View this message in context: http://qpid.2158936.n2.nabble.com/pn-delivery-finalize-Assertion-failure-tp7660170p7660172.html
Sent from the Apache Qpid users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org