
Posted to users@qpid.apache.org by Christopher Morgan <Ch...@solace.com> on 2018/03/22 21:23:56 UTC

Proton-c transport failure on reconnect/failover for brokers that begin the session delivery id sequence not at 0

Hi all,

I'm trying to write an example failover application, with a host list, for a Solace AMQP broker using qpid-proton-python, with the example code from http://qpid.2158936.n2.nabble.com/About-failover-td7649247.html as a base. When I ran the application I got a lot of transport errors from pn_do_transfer in transport.c:
<code>
if (id_present && id != state->id) {
      return pn_do_error(transport, "amqp:session:invalid-field",
                         "sequencing error, expected delivery-id %u, got %u",
                         state->id, id);
}
</code>

The errors occurred on the new connection, on the first transfer frame, which had delivery id 1. The Solace AMQP broker begins its delivery id sequence at 1, but on the reconnected session the ssn->state.incoming->next value seems to be set to 0. I noticed there is code to set the ssn->state.incoming->next value from the first incoming transfer, based on the flag ssn->state.incoming_init. When I added some extra logging around that value, I noticed that the ssn->state.incoming_init flag is not reset when reconnecting, and that the value is set to 0 as part of the transport unbind process. I also modified the pn_session_unbind function in engine.c to set ssn->state.incoming_init to false, and that seemed to fix my issue.
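
The behaviour described above can be illustrated with a small stand-alone model. This is only a sketch and not the proton-c source: the type and function names (incoming_state, on_transfer, unbind_as_observed, unbind_with_reset) are made up for illustration, and only the next and incoming_init fields mirror the names mentioned above. It assumes the receiver adopts the first delivery id it sees while the flag is clear, and that unbinding zeroes next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the per-session incoming state (hypothetical type). */
typedef struct {
    uint32_t next;          /* next delivery id the receiver expects */
    bool     incoming_init; /* true once 'next' has been seeded from a transfer */
} incoming_state;

/* Handle one incoming transfer; returns 0 on success, -1 on a sequencing error. */
static int on_transfer(incoming_state *s, uint32_t id)
{
    assert(s != NULL);
    if (!s->incoming_init) {
        /* First transfer seen while the flag is clear: adopt the peer's starting id. */
        s->next = id;
        s->incoming_init = true;
    }
    if (id != s->next) {
        return -1; /* corresponds to "sequencing error, expected delivery-id ..." */
    }
    s->next++;
    return 0;
}

/* Unbind as observed in the report: 'next' goes back to 0, the flag stays set. */
static void unbind_as_observed(incoming_state *s)
{
    assert(s != NULL);
    s->next = 0;
}

/* Unbind with the proposed change: also clear the flag so the next bind re-seeds. */
static void unbind_with_reset(incoming_state *s)
{
    assert(s != NULL);
    s->next = 0;
    s->incoming_init = false;
}
```

In this model, a peer that starts at delivery id 1 is accepted on the first connection. After unbind_as_observed the first transfer with id 1 is rejected because 0 is expected, whereas after unbind_with_reset it is accepted again.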

I'm somewhat new to the proton-c source and am curious whether this is a bug. If not, why wouldn't incoming_init be reset? If so, is there a better place to reset the flag? Are there more places where it should be reset?

Thanks

Chris Morgan

Re: Proton-c transport failure on reconnect/failover for brokers that begin the session delivery id sequence not at 0

Posted by cmorgan <Ch...@solace.com>.

 
> It's probably just been expected it would start at 0 as it does elsewhere,
> but I don't see anything in the spec requiring that and it isn't
> negotiated.

I was hoping that was the case :)

> You can raise JIRAs at https://issues.apache.org/jira/browse/PROTON.
> If you wanted you could even try a patch, or raise a PR. Include the
> JIRA key in the PR title and commit messages and they'll get lined up
> for some bot handling later (comments etc), e.g. see an existing PR
> such as https://github.com/apache/qpid-proton/pull/134.

Thanks, I'll raise a JIRA issue.


aconway.rh wrote
> Are you re-using the same pn_transport_t for the new connection? I don't
> think pn_transport_t was designed with the expectation of being re-used, so
> I wouldn't be surprised if it doesn't work.
> I'd throw away the old transport and create a new one to re-connect. I
> believe that's what the C++ reconnect code does - you may find that useful
> for inspiration, it's based on the C library.
> 
> https://github.com/alanconway/qpid-proton/blob/ruby-schedule/proton-c/bindings/cpp/src/proactor_container_impl.cpp#L1

My sample is in Python, using the python-proton binding. As far as I can tell,
the Python binding code just sets the next URI on the connection and reuses
the connection struct (as well as the session struct), not the transport. I'm
not sure, but wouldn't connection.open() create a new transport?

Also, thank you for the reconnect code. I will eventually be making samples
using proton-c directly. The C++ code seems to handle
amqp:connection:forced conditions, whereas the Python binding's reconnect
logic does not, which makes me concerned about the Python binding. This might
be another issue.

Chris Morgan





---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org

Re: Proton-c transport failure on reconnect/failover for brokers that begin the session delivery id sequence not at 0

Posted by Alan Conway <ac...@redhat.com>.

On Fri, Mar 23, 2018 at 9:10 AM, Robbie Gemmell <ro...@gmail.com>
wrote:

> The C code isn't in my wheelhouse, but it sounds like a bug. It's
> probably just been expected it would start at 0 as it does elsewhere,
> but I don't see anything in the spec requiring that and it isn't
> negotiated.
>
> You can raise JIRAs at https://issues.apache.org/jira/browse/PROTON.
> If you wanted you could even try a patch, or raise a PR. Include the
> JIRA key in the PR title and commit messages and they'll get lined up
> for some bot handling later (comments etc), e.g. see an existing PR
> such as https://github.com/apache/qpid-proton/pull/134.
>
> Robbie

Are you re-using the same pn_transport_t for the new connection? I don't
think pn_transport_t was designed with the expectation of being re-used, so
I wouldn't be surprised if it doesn't work.
I'd throw away the old transport and create a new one to re-connect. I
believe that's what the C++ reconnect code does - you may find that useful
for inspiration, it's based on the C library.

https://github.com/alanconway/qpid-proton/blob/ruby-schedule/proton-c/bindings/cpp/src/proactor_container_impl.cpp#L1
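
As a rough illustration of the "throw away the old transport and create a new one" suggestion, here is a small stand-alone sketch of a host-list failover loop in which every attempt starts from newly allocated state. It does not use the proton-c API: transport_state stands in for a pn_transport_t, and connect_fn, failover_connect and connect_only_backup are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-connection state; stands in for whatever a transport tracks. */
typedef struct {
    uint32_t next_incoming_id;
    bool     incoming_init;
} transport_state;

/* Hypothetical connect callback: returns true if 'host' accepted the connection. */
typedef bool (*connect_fn)(const char *host, transport_state *fresh);

/*
 * Try each host in turn, starting with the one after index 'last' and wrapping
 * around. Every attempt gets newly zeroed state; state from a failed attempt is
 * freed and never reused. Returns the index of the host that connected, or -1.
 * On success *out owns the state and the caller must free() it.
 */
static int failover_connect(const char *const *hosts, size_t n, size_t last,
                            connect_fn try_connect, transport_state **out)
{
    assert(hosts != NULL && try_connect != NULL && out != NULL);
    *out = NULL;
    for (size_t i = 1; i <= n; i++) {
        size_t idx = (last + i) % n;
        transport_state *fresh = calloc(1, sizeof *fresh);
        if (fresh == NULL) {
            return -1; /* out of memory */
        }
        if (try_connect(hosts[idx], fresh)) {
            *out = fresh;
            return (int)idx;
        }
        free(fresh); /* discard state from the failed attempt */
    }
    return -1;
}

/* Example stub for demonstration: only the host named "backup" is reachable. */
static bool connect_only_backup(const char *host, transport_state *fresh)
{
    (void)fresh;
    return strcmp(host, "backup") == 0;
}
```

Because the state is allocated inside the loop, nothing such as a stale expected delivery id can carry over from the previous connection, which is the property the advice above relies on.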

Re: Proton-c transport failure on reconnect/failover for brokers that begin the session delivery id sequence not at 0

Posted by Robbie Gemmell <ro...@gmail.com>.

The C code isn't in my wheelhouse, but it sounds like a bug. It's
probably just been expected it would start at 0 as it does elsewhere,
but I don't see anything in the spec requiring that and it isn't
negotiated.

You can raise JIRAs at https://issues.apache.org/jira/browse/PROTON.
If you wanted you could even try a patch, or raise a PR. Include the
JIRA key in the PR title and commit messages and they'll get lined up
for some bot handling later (comments etc), e.g. see an existing PR
such as https://github.com/apache/qpid-proton/pull/134.

Robbie

