Posted to users@qpid.apache.org by Jeff Donner <jd...@morphodetection.com> on 2017/01/14 02:43:09 UTC

Slow-starting queue visibility and handling it

Hi,

  We're using proton 14, and qpidd 1.35. We probably have bad routing at one spot from a proton client to a qpidd broker on another machine, and the client is slow to see the queue (it fails out with "amqp:not-found"), even though the broker has been running steadily and the ping time is ~500ms. qpid-tool is also slow to see anything - you can type list for 5 seconds before you see any queues (it takes about 5 seconds even run locally, though it can take much longer run remotely). We presume something similar is happening to our proton client (though it looks for a specific queue, so there should be no time spent collecting metainfo).

The client fails instantly - if the client were failing to connect at the TCP level wouldn't TCP just wait for a few seconds?

We added an undocumented 'proton::reconnect_timer' to client options:

    proton::reconnect_timer reconnection_info(/*first=*/0,
                                              /*max delay=*/-1,
                                              /*increment=*/500,
                                              /*doubling=*/false,
                                              /*max_retries=*/20,
                                              /*timeout=*/-1);

    proton::connection_options client_conn_opts;
    client_conn_opts.reconnect(reconnection_info);
    c.client_connection_options(client_conn_opts);
    connection_ = c.connect(get_url());

// ... elsewhere ...

    client_thread_ = std::make_unique<std::thread>([this]()
    {
      LOGF(logger_, LoggerIF::INFO,
           "client thread started using URL %s",
           central_conf_->client_url.c_str());
      for (;;)
      {
        try
        {
          proton::default_container(*client_).run();
          LOGF(logger_, LoggerIF::INFO, "client completed");
        }
        catch (const std::exception& e)
        {
          LOGF(logger_, LoggerIF::ERROR,
               "client completed due to exception: %s", e.what());
        }
        client_->close();
      }
      return 0;
    });


But it doesn't seem to prevent the initial failing. Is the reconnect_timer just for /dropped/ connections?

Is there any mechanism to cope with slow connections? This code all works fine on better-connected machines.

Thanks,
Jeff

RE: Slow-starting queue visibility and handling it

Posted by Jeff Donner <jd...@morphodetection.com>.

Doh. It turned out we had a mismatched queue name. Sorry ...! And thanks. 

Jeff
________________________________________
From: Gordon Sim [gsim@redhat.com]
Sent: Monday, January 16, 2017 1:55 AM
To: users@qpid.apache.org
Subject: Re: Slow-starting queue visibility and handling it

On 14/01/17 02:43, Jeff Donner wrote:
>   We're using proton 14, and qpidd 1.35. We probably have bad routing at one spot from a proton client to a qpidd broker on another machine, and the client is slow to see the queue (it fails out with "amqp:not-found"), even though the broker has been running steadily and the ping time is ~500ms. qpid-tool is also slow to see anything - you can type list for 5 seconds before you see any queues (it takes about 5 seconds even run locally, though it can take much longer run remotely). We presume something similar is happening to our proton client (though it looks for a specific queue, so there should be no time spent collecting metainfo).
>
> The client fails instantly - if the client were failing to connect at the TCP level wouldn't TCP just wait for a few seconds?

[...]

> But it doesn't seem to prevent the initial failing. Is the reconnect_timer just for /dropped/ connections?
>
> Is there any mechanism to cope with slow connections? This code all works fine on better-connected machines.

Usually, the amqp:not-found error is sent back by the broker when the
client tries to attach to a queue that doesn't exist. How is the queue
in question being created?

I don't *think* this has anything to do with 'slowness', if it is indeed
the broker that is sending back that error, as I think it must be. Is it
possible it is somehow connecting to the wrong broker?

You say 'initial failing', does that mean that if it retries it usually
succeeds?

Can you get a protocol trace from the failing client (run with env var
PN_TRACE_FRM=1)? Or a Wireshark trace?


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org



Re: Slow-starting queue visibility and handling it

Posted by Gordon Sim <gs...@redhat.com>.

On 14/01/17 02:43, Jeff Donner wrote:
>   We're using proton 14, and qpidd 1.35. We probably have bad routing at one spot from a proton client to a qpidd broker on another machine, and the client is slow to see the queue (it fails out with "amqp:not-found"), even though the broker has been running steadily and the ping time is ~500ms. qpid-tool is also slow to see anything - you can type list for 5 seconds before you see any queues (it takes about 5 seconds even run locally, though it can take much longer run remotely). We presume something similar is happening to our proton client (though it looks for a specific queue, so there should be no time spent collecting metainfo).
>
> The client fails instantly - if the client were failing to connect at the TCP level wouldn't TCP just wait for a few seconds?

[...]

> But it doesn't seem to prevent the initial failing. Is the reconnect_timer just for /dropped/ connections?
>
> Is there any mechanism to cope with slow connections? This code all works fine on better-connected machines.

Usually, the amqp:not-found error is sent back by the broker when the
client tries to attach to a queue that doesn't exist. How is the queue 
in question being created?

I don't *think* this has anything to do with 'slowness', if it is indeed 
the broker that is sending back that error, as I think it must be. Is it 
possible it is somehow connecting to the wrong broker?

You say 'initial failing', does that mean that if it retries it usually 
succeeds?

Can you get a protocol trace from the failing client (run with env var
PN_TRACE_FRM=1)? Or a Wireshark trace?
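
To illustrate the distinction drawn above between a connection problem and an error reported by the broker, here is a minimal C++ sketch that sorts AMQP 1.0 error condition names by the prefix the specification gives them. The names condition_scope, classify_condition and is_connection_scoped are hypothetical helpers made up for this example and are not part of the Proton API. The sketch only inspects the condition string; whether a given client library retries on a given condition is a separate question that it does not answer.

```cpp
#include <string>

// Scope suggested by the standard AMQP 1.0 condition prefixes.
// Hypothetical type for illustration, not a Proton class.
enum class condition_scope { connection, session, link, general };

// Classify a condition name such as "amqp:not-found" by its prefix.
// Names without a connection/session/link prefix (for example
// "amqp:not-found" or "amqp:unauthorized-access") are treated as general.
inline condition_scope classify_condition(const std::string& name) {
    auto starts_with = [&name](const std::string& prefix) {
        return name.compare(0, prefix.size(), prefix) == 0;
    };
    if (starts_with("amqp:connection:")) return condition_scope::connection;
    if (starts_with("amqp:session:")) return condition_scope::session;
    if (starts_with("amqp:link:")) return condition_scope::link;
    return condition_scope::general;
}

// True only for conditions defined at connection scope,
// e.g. "amqp:connection:forced".
inline bool is_connection_scoped(const std::string& name) {
    return classify_condition(name) == condition_scope::connection;
}
```

With this, is_connection_scoped("amqp:not-found") is false: the broker was reached and answered, which is consistent with the suggestion that the cause lies in the queue name or the choice of broker rather than in a slow link.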
