Posted to dev@qpid.apache.org by Rafael Schloming <ra...@redhat.com> on 2010/03/02 05:45:08 UTC

Re: SubscriptionManager performance problem.

Gordon Sim wrote:
> On 02/26/2010 03:09 PM, Alan Conway wrote:
>> On 02/26/2010 09:47 AM, Gordon Sim wrote:
>>> On 02/26/2010 02:29 PM, Alan Conway wrote:
>>>> Gordon/Rafi: this raises an interesting question for the new APIs. It
>>>> seems like async declare/bind/subscribe are important features for
>>>> cases like this.
>>>
>>> I agree, this is an interesting case. On the face of it my initial
>>> suggestion would be a single receiver for an address using create:
>>> always and a list of bindings 20,000 long.
>>
>> You mean construct a single address string with 20,000 bindings in it?
> 
> Wouldn't necessarily need to be done as a string. The 
> qpid::messaging::Address class can be manipulated directly. However...
> 
>> That doesn't sound practical. I think we need a way to split apart the
>> process of constructing a receiver in this case so we can create a
>> receiver and then incrementally add bindings to it. That would seem like
>> a useful feature in any case - the set of things a receiver is
>> interested in may change dynamically over its lifetime so it would be
>> good to be able to add/remove bindings to a receiver.
> 
> ...I can see that being a valuable addition for some cases, yes.

It turns out that the Python client (sort of by accident) is actually
capable of doing an async query/declare/bind/subscribe.

If you create receivers on a disconnected connection, they get created
locally but not remotely until the connection is connected, and at that
point they are all created in one big batch.

Out of curiosity I tried creating 20,000 separate receivers in this
manner with the Python client and it took about 75 seconds:

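from qpid.messaging import Connection

# Build the connection object, but do not connect yet.
c = Connection("localhost", 5672)
s = c.session()

# Receivers created while disconnected exist only locally for now.
rcvs = []
for i in range(20000):
    rcvs.append(s.receiver("amq.topic/%s" % i))

# On connect, the pending receivers are created remotely in one batch.
print("connecting")
c.connect()
print("connected")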

Unfortunately, if you connect() before creating the receivers, each
receiver() call blocks until the resulting receiver is fully
subscribed, which makes the whole process synchronous again. This
version of the code was still running after 30 minutes, although part
of that is due to a linear search through the receivers list:

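from qpid.messaging import Connection

c = Connection("localhost", 5672)
s = c.session()

# Connect first, so every receiver() call below is a blocking round trip.
print("connecting")
c.connect()

rcvs = []
for i in range(20000):
    rcvs.append(s.receiver("amq.topic/%s" % i))

print("connected")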

It would, however, be straightforward to add an option to the receiver 
call to control explicitly whether or not it blocks, thereby permitting 
asynchronous receiver creation on a connected connection.
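
As a rough illustration of what such an option might buy, here is a toy
model in plain Python. It does not use the qpid client at all: ToySession,
its "block" keyword, and flush() are made-up names for this sketch, and the
"broker" is just a list. With block=True every receiver() call pays one
simulated round trip; with block=False the requests queue up locally and go
out in a single batch.

```python
class ToySession(object):
    """Toy stand-in for a session; NOT the qpid.messaging API."""

    def __init__(self):
        self.pending = []      # receivers requested but not yet sent
        self.subscribed = []   # receivers the pretend broker has accepted
        self.round_trips = 0   # simulated network round trips

    def receiver(self, address, block=True):
        # 'block' is a hypothetical option, named here for illustration only.
        self.pending.append(address)
        if block:
            self.flush()       # wait for this one receiver before returning
        return address

    def flush(self):
        # Send everything queued so far as one batch (one round trip).
        if self.pending:
            self.subscribed.extend(self.pending)
            self.pending = []
            self.round_trips += 1


def create_receivers(count, block):
    s = ToySession()
    for i in range(count):
        s.receiver("amq.topic/%s" % i, block=block)
    s.flush()                  # no-op if every call already blocked
    return s


sync = create_receivers(1000, block=True)
batched = create_receivers(1000, block=False)
print("blocking: %d round trips" % sync.round_trips)
print("non-blocking: %d round trips" % batched.round_trips)
```

The model ignores everything that makes the real case hard (errors,
ordering, flow control); it only shows where the per-receiver wait goes away.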

--Rafael


---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org


Re: SubscriptionManager performance problem.

Posted by Rafael Schloming <ra...@redhat.com>.
Alan Conway wrote:
> On 03/01/2010 11:45 PM, Rafael Schloming wrote:
>> Gordon Sim wrote:
>>> On 02/26/2010 03:09 PM, Alan Conway wrote:
>>>> On 02/26/2010 09:47 AM, Gordon Sim wrote:
>>>>> On 02/26/2010 02:29 PM, Alan Conway wrote:
>>>>>> Gordon/Rafi: this raises an interesting question for the new APIs. It
>>>>>> seems like async declare/bind/subscribe are important features for
>>>>>> cases like this.
>>>>>
>>>>> I agree, this is an interesting case. On the face of it my initial
>>>>> suggestion would be a single receiver for an address using create:
>>>>> always and a list of bindings 20,000 long.
>>>>
>>>> You mean construct a single address string with 20,000 bindings in it?
>>>
>>> Wouldn't necessarily need to be done as a string. The
>>> qpid::messaging::Address class can be manipulated directly. However...
>>>
>>>> That doesn't sound practical. I think we need a way to split apart the
>>>> process of constructing a receiver in this case so we can create a
>>>> receiver and then incrementally add bindings to it. That would seem
>>>> like a useful feature in any case - the set of things a receiver is
>>>> interested in may change dynamically over its lifetime so it would be
>>>> good to be able to add/remove bindings to a receiver.
>>>
>>> ...I can see that being a valuable addition for some cases, yes.
>>
>> It turns out that the python client (sort of by accident) is actually
>> capable of doing an async query/declare/bind/subscribe.
>>
>> It just so happens that if you create receivers on a disconnected
>> connection they get created locally, but not remotely until the
>> connection is connected again, and when this happens they're all created
>> in a big batch.
>>
>> Out of curiosity I tried creating 20,000 separate receivers in this
>> manner with the python client and it took about 75 seconds:
>>
>> c = Connection("localhost", 5672)
>> s = c.session()
>>
>> rcvs = []
>> for i in range(20000):
>>    rcvs.append(s.receiver("amq.topic/%s" % i))
>>
>> print "connecting"
>> c.connect()
>> print "connected"
>>
>> Unfortunately if you connect() before creating the receivers then the
>> receiver() call will block until the resulting receiver is fully
>> subscribed before returning, thereby rendering the whole process
>> synchronous again. This version of the code was still running after 30
>> minutes, although part of that is due to a linear search through the
>> receivers list:
>>
>> c = Connection("localhost", 5672)
>> s = c.session()
>>
>> print "connecting"
>> c.connect()
>>
>> rcvs = []
>> for i in range(20000):
>>    rcvs.append(s.receiver("amq.topic/%s" % i))
>>
>> print "connected"
>>
>> It would, however, be straightforward to add an option to the receiver
>> call to control explicitly whether or not it blocks, thereby permitting
>> asynchronous receiver creation on a connected connection.
>>
> 
> Sounds like that would be a worthwhile extension before we release the 
> API, with a similar option on the C++ side.

Agreed

> What about declaring 
> queues/exchanges and creating bindings?

Addresses can include queue, exchange, and binding declarations, so you
can always do this by asynchronously declaring a whole bunch of
senders. This is natural if you intend to use the senders, but it is
obviously a bit hackish if you are just creating them for the side effect.
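
For a concrete picture, an address that declares a queue together with some
bindings might be assembled as below. The option names used here (create,
node, x-bindings) are an assumption based on the documented Qpid address
syntax and may differ between client versions, so check them against the
client in use. make_address is a helper invented for this sketch.

```python
def make_address(queue, exchange, keys):
    """Build an address string that declares a queue plus its bindings.

    Hypothetical helper: the option names are assumed, not taken from
    this thread.
    """
    bindings = ", ".join(
        "{exchange: '%s', key: '%s'}" % (exchange, key) for key in keys
    )
    return "%s; {create: always, node: {x-bindings: [%s]}}" % (queue, bindings)


addr = make_address("my-queue", "amq.topic", ["a", "b"])
print(addr)
```

A string like this could then be handed to s.sender(...) or s.receiver(...)
in the same way as the plain "amq.topic/%s" addresses in the earlier loops.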

I do think we need a better way to do this in the latter case; however, I
think that this would fall more into the management side of things than
into the messaging API proper. We just need to be sure the two work
naturally with each other.

One thing that isn't covered though is the ability to adjust an address 
while in use, e.g. update the address definition to add or remove a 
binding without resorting to the above hack. IMHO this is something that 
should ultimately be covered in a slightly nicer way by the API, 
although I think we can add it in later without too much trouble.
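
To make the idea concrete: adjusting an address in use boils down to
comparing the bindings currently in place with the ones wanted and issuing
only the difference. The sketch below is plain Python with invented names
(binding_changes and the "bind"/"unbind" tuples); nothing in it is an
existing API.

```python
def binding_changes(current, wanted):
    """Return the (action, key) steps that turn 'current' into 'wanted'."""
    current, wanted = set(current), set(wanted)
    # Drop bindings that are no longer wanted, then add the new ones.
    steps = [("unbind", key) for key in sorted(current - wanted)]
    steps += [("bind", key) for key in sorted(wanted - current)]
    return steps


steps = binding_changes(["a", "b", "c"], ["b", "c", "d"])
print(steps)
```

Bindings present on both sides are left alone, so a small change to a
large binding list costs only a few operations.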

--Rafael




Re: SubscriptionManager performance problem.

Posted by Alan Conway <ac...@redhat.com>.
On 03/01/2010 11:45 PM, Rafael Schloming wrote:
> Gordon Sim wrote:
>> On 02/26/2010 03:09 PM, Alan Conway wrote:
>>> On 02/26/2010 09:47 AM, Gordon Sim wrote:
>>>> On 02/26/2010 02:29 PM, Alan Conway wrote:
>>>>> Gordon/Rafi: this raises an interesting question for the new APIs. It
>>>>> seems like async declare/bind/subscribe are important features for
>>>>> cases like this.
>>>>
>>>> I agree, this is an interesting case. On the face of it my initial
>>>> suggestion would be a single receiver for an address using create:
>>>> always and a list of bindings 20,000 long.
>>>
>>> You mean construct a single address string with 20,000 bindings in it?
>>
>> Wouldn't necessarily need to be done as a string. The
>> qpid::messaging::Address class can be manipulated directly. However...
>>
>>> That doesn't sound practical. I think we need a way to split apart the
>>> process of constructing a receiver in this case so we can create a
>>> receiver and then incrementally add bindings to it. That would seem like
>>> a useful feature in any case - the set of things a receiver is
>>> interested in may change dynamically over its lifetime so it would be
>>> good to be able to add/remove bindings to a receiver.
>>
>> ...I can see that being a valuable addition for some cases, yes.
>
> It turns out that the python client (sort of by accident) is actually
> capable of doing an async query/declare/bind/subscribe.
>
> It just so happens that if you create receivers on a disconnected
> connection they get created locally, but not remotely until the
> connection is connected again, and when this happens they're all created
> in a big batch.
>
> Out of curiosity I tried creating 20,000 separate receivers in this
> manner with the python client and it took about 75 seconds:
>
> c = Connection("localhost", 5672)
> s = c.session()
>
> rcvs = []
> for i in range(20000):
>    rcvs.append(s.receiver("amq.topic/%s" % i))
>
> print "connecting"
> c.connect()
> print "connected"
>
> Unfortunately if you connect() before creating the receivers then the
> receiver() call will block until the resulting receiver is fully
> subscribed before returning, thereby rendering the whole process
> synchronous again. This version of the code was still running after 30
> minutes, although part of that is due to a linear search through the
> receivers list:
>
> c = Connection("localhost", 5672)
> s = c.session()
>
> print "connecting"
> c.connect()
>
> rcvs = []
> for i in range(20000):
>    rcvs.append(s.receiver("amq.topic/%s" % i))
>
> print "connected"
>
> It would, however, be straightforward to add an option to the receiver
> call to control explicitly whether or not it blocks, thereby permitting
> asynchronous receiver creation on a connected connection.
>

Sounds like that would be a worthwhile extension before we release the API,
with a similar option on the C++ side. What about declaring queues/exchanges
and creating bindings?
