You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@qpid.apache.org by "Robbie Gemmell (Jira)" <ji...@apache.org> on 2021/04/22 14:45:00 UTC

[jira] [Commented] (PROTON-2378) [Python] AMQP client with only sender gets stuck on select timing out

    [ https://issues.apache.org/jira/browse/PROTON-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17329166#comment-17329166 ] 

Robbie Gemmell commented on PROTON-2378:
----------------------------------------

Python is not my thing, but would I be right in thinking you have 2 threads here, one 'sending' and another one running the container? If so that is your problem.

The container is inherently single threaded, all work such must occur on the thread running the container. You need to operate directly in it, or coordinate passing it work from other threads if you have them, you cant e.g just call send from another thread.

The container will only process outgoing IO after it has just done something. With the illegal multi threading, I expect the container will only be processing outgoing IO just as it goes back to sleep on select, then either waiting for the 3.14 seconds you noted or until incoming network activity (e.g received messages, dispositions, etc) wakes it earlier, at which point it might then process more outgoing IO before going back to sleep again. Your 'send every second' from the other thread is essentially just buffering stuff for it to do, then randomly having it sent by the container thread at a later point after it wakes itself. Meanwhile the two threads are free to corrupt the data if they trip over each other, as the container is not thread safe.

I believe https://github.com/apache/qpid-proton/blob/0.34.0/python/examples/db_send.py has example of work from another thread using its injector bits, while https://github.com/apache/qpid-proton/blob/0.34.0/python/examples/recurring_timer.py shows a more simplistic use of basic scheduling that you could use to e.g simplistically poll a work queue for stuff to send.

> [Python] AMQP client with only sender gets stuck on select timing out
> ---------------------------------------------------------------------
>
>                 Key: PROTON-2378
>                 URL: https://issues.apache.org/jira/browse/PROTON-2378
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: proton-c, python-binding
>    Affects Versions: proton-c-0.33.0
>            Reporter: Sebastiaan la Fleur
>            Priority: Major
>         Attachments: test_receiver.py, test_sender.py
>
>
> Python version: 3.6.8
> Unfortunately not yet tested on proton-c-0.34.0 as the pypi release is still at 0.33.0.
>  
> Situation:
> There is an AMQP receiver and an AMQP sender with only 1 link being used from the sender to the receiver. There is data sent every 1 second by the sender.
> In my testing of the proton library for Python I noticed that sending messages is sometimes delayed up to 3.14 seconds. It appears the code gets 'stuck' on the line
> {code:java}
> readable, writable, expired = self._selector.select(r.timeout){code}
> at _handlers.py.IOHandler.on_reactor_quiesced (line 1210: [https://github.com/apache/qpid-proton/blob/0.33.0/python/proton/_handlers.py#L1210] )
> The value of `r.timeout` always appears to be 3.14 which is set at line 718 _reactor.py.Reactor.run.
> {code:java}
> self.timeout = 3.14159265359{code}
>  
> All-in-all this shows behavior where no new TCP data or AMQP frames are expected by the sender so the select waits and times out. However, there is new data to be sent by the Sender but this does not cause the select to terminate early. Therefore, the messages send by the sender are 'batched' in 3's before being send to the receiver.
>  
> Attached is an AMQP receiver and sender meant as an minimal example to showcase the issue.
> The Sender logs the sent & accepted timestamps. We would expect a message to be send every 1 second and it should be accepted in milliseconds. Instead it can take up to 3.14 seconds to be accepted. An example log:
>  
> {code:java}
>  Delivery 1 which was sent at 1619095634.3362126 was accepted at 1619095634.3540177 so it took 0.017805099487304688 seconds.
>  Delivery 2 which was sent at 1619095635.3695796 was accepted at 1619095637.5099232 so it took 2.14034366607666 seconds.
>  Delivery 3 which was sent at 1619095636.3740473 was accepted at 1619095637.5107296 so it took 1.1366822719573975 seconds.
>  Delivery 4 which was sent at 1619095637.381199 was accepted at 1619095637.5114946 so it took 0.1302957534790039 seconds.
>  Delivery 5 which was sent at 1619095638.383219 was accepted at 1619095640.671438 so it took 2.2882189750671387 seconds.
>  Delivery 6 which was sent at 1619095639.384885 was accepted at 1619095640.6722338 so it took 1.287348747253418 seconds.
>  Delivery 7 which was sent at 1619095640.3868084 was accepted at 1619095640.6729164 so it took 0.28610801696777344 seconds.
>  Delivery 8 which was sent at 1619095641.3904433 was accepted at 1619095643.8257194 so it took 2.4352760314941406 seconds.
>  Delivery 9 which was sent at 1619095642.3931684 was accepted at 1619095643.8264139 so it took 1.4332454204559326 seconds.
>  Delivery 10 which was sent at 1619095643.3968666 was accepted at 1619095643.8271492 so it took 0.4302825927734375 seconds.
>  Delivery 11 which was sent at 1619095644.4198678 was accepted at 1619095646.986702 so it took 2.5668342113494873 seconds.
>  Delivery 12 which was sent at 1619095645.420819 was accepted at 1619095646.9876444 so it took 1.5668253898620605 seconds.
>  Delivery 13 which was sent at 1619095646.421948 was accepted at 1619095646.988455 so it took 0.56650710105896 seconds.
>  Delivery 14 which was sent at 1619095647.4390144 was accepted at 1619095650.140986 so it took 2.7019715309143066 seconds.
>  Delivery 15 which was sent at 1619095648.4405417 was accepted at 1619095650.1420805 so it took 1.7015388011932373 seconds.
>  Delivery 16 which was sent at 1619095649.4577622 was accepted at 1619095650.1430688 so it took 0.6853065490722656 seconds.
>  Delivery 17 which was sent at 1619095650.5015457 was accepted at 1619095653.3227403 so it took 2.821194648742676 seconds.
>  Delivery 18 which was sent at 1619095651.5050397 was accepted at 1619095653.3230667 so it took 1.8180270195007324 seconds.
>  Delivery 19 which was sent at 1619095652.5123448 was accepted at 1619095653.3235378 so it took 0.8111929893493652 seconds.
>  Delivery 20 which was sent at 1619095653.5286038 was accepted at 1619095656.4846065 so it took 2.956002712249756 seconds.
>  Delivery 21 which was sent at 1619095654.5309548 was accepted at 1619095656.4857142 so it took 1.9547593593597412 seconds.{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org