Posted to dev@qpid.apache.org by Alan Conway <al...@gmail.com> on 2014/05/14 22:12:53 UTC

[router] A couple of questions on router code.

I'm getting up to speed with the router code and have a couple of
questions:

alloc.c: Do we have performance data that shows this is better than
malloc?
Modern malloc() has optimizations for per-thread allocation, small
objects and so on. In my experience it's very hard to beat. The problems
with a custom allocator are:
- it's more code to maintain, and it's hard to get right.
- it makes tools like valgrind much less useful.

qd_buffer_set_size/qd_buffer is not thread safe. We don't seem to use
set_size; we should get rid of it if we don't need it, or make it
thread safe if we do.

router_semantics_for_addr: linear search for address. This is not on the
critical path for messages, but it is on the path for subscriptions, and
we've seen situations in the past where creating subscriptions can be
critical too.

Thanks, and apologies if I've missed anything obvious.

Cheers,
Alan.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org


Re: [router] A couple of questions on router code.

Posted by Rafael Schloming <rh...@alum.mit.edu>.
On Thu, May 15, 2014 at 10:35 AM, Ted Ross <tr...@redhat.com> wrote:

>
>
> On 05/14/2014 04:12 PM, Alan Conway wrote:
> > I'm getting up to speed with the router code, I had a couple of
> > questions:
> >
> > alloc.c: Do we have performance data that shows this is better than
> > malloc?
> > Modern malloc() has optimizations for per-thread, small-object etc. etc.
> > In my experience it's very hard to beat. The problem with custom alloc
> > is:
> > - it's more code to maintain, it's hard to get right.
> > - it makes tools like valgrind much less useful.
>
> We don't have any benchmark data for this.  It would be good to find out
> how it actually performs.
>
> This allocation mechanism comes from our experience with the C++ broker
> and problems with the general-purpose heap manager in Linux.  Because a
> broker/router will very frequently allocate memory on one thread and
> free it on another, there is either lock contention or significant
> over-allocation.  We need to disable the "arena" optimizations in malloc
> to prevent memory exhaustion.
>
> The alloc.c module provides per-thread free-pools with limited size
> thereby amortizing the locking overhead for many operations into a
> single rebalancing of memory between threads (i.e. a lock is not taken
> out during alloc/free unless there is a need to rebalance a block of
> memory into or out of the thread pool).
>
> Dispatch uses alloc.c in select cases, where rapid and frequent
> allocation is expected.  The general heap or static allocation is used
> everywhere else.
>

Can you describe in more detail the cases you mention? I've been looking
into the possibility of adding some sort of buffer pooling to proton
internals, possibly with an eye to expose/allow integration with external
pooling strategies, so I'd be interested in getting more details on this,
particularly if any of the cases you mention intersect with proton buffers.

--Rafael

Re: [router] A couple of questions on router code.

Posted by Ted Ross <tr...@redhat.com>.

On 05/14/2014 04:12 PM, Alan Conway wrote:
> I'm getting up to speed with the router code, I had a couple of
> questions:
> 
> alloc.c: Do we have performance data that shows this is better than
> malloc?
> Modern malloc() has optimizations for per-thread, small-object etc. etc.
> In my experience it's very hard to beat. The problem with custom alloc
> is:
> - it's more code to maintain, it's hard to get right.
> - it makes tools like valgrind much less useful.

We don't have any benchmark data for this.  It would be good to find out
how it actually performs.

This allocation mechanism comes from our experience with the C++ broker
and problems with the general-purpose heap manager in Linux.  Because a
broker/router will very frequently allocate memory on one thread and
free it on another, there is either lock contention or significant
over-allocation.  We need to disable the "arena" optimizations in malloc
to prevent memory exhaustion.
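
The message does not say how the arena behaviour was disabled. As one
hedged illustration, glibc exposes the M_ARENA_MAX parameter through
mallopt() (it can also be set with the MALLOC_ARENA_MAX environment
variable); capping it at 1 makes all threads share a single arena. The
helper name limit_malloc_arenas is made up for this sketch and is not
part of Dispatch or the C++ broker.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef __GLIBC__
#include <malloc.h>   /* glibc-specific: mallopt(), M_ARENA_MAX */
#endif

/*
 * Hypothetical helper: cap the number of malloc arenas.
 * Returns 1 if the limit was applied, 0 if it was not (for example
 * when the C library is not glibc and has no such tunable).
 * Call it early, before other threads start allocating.
 */
int limit_malloc_arenas(int max_arenas)
{
    assert(max_arenas > 0);
#if defined(__GLIBC__) && defined(M_ARENA_MAX)
    return mallopt(M_ARENA_MAX, max_arenas);
#else
    (void) max_arenas;
    return 0;
#endif
}
```

Calling limit_malloc_arenas(1) at startup trades the over-allocation of
per-thread arenas for contention on a single arena lock, which is the
trade-off described above.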

The alloc.c module provides per-thread free-pools with limited size,
thereby amortizing the locking overhead for many operations into a
single rebalancing of memory between threads (i.e. no lock is taken
during alloc/free unless there is a need to rebalance a block of
memory into or out of the thread pool).
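
To make the idea concrete, here is a minimal sketch of a per-thread
free-pool with batch rebalancing. It is not the alloc.c code: the names
(pool_alloc, pool_free, pool_lock_count), the fixed item size and the
LOCAL_MAX/BATCH limits are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define ITEM_SIZE 64  /* assumed fixed object size served by the pool */
#define LOCAL_MAX 8   /* assumed cap on items cached per thread */
#define BATCH     4   /* assumed number of items moved per rebalance */

typedef struct item { struct item *next; } item_t;

/* Global pool shared by all threads, protected by a mutex. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static item_t         *global_free = NULL;
static unsigned long   lock_count  = 0;  /* lock acquisitions, for illustration */

/* Per-thread pool: touched without any locking. */
static _Thread_local item_t *local_free  = NULL;
static _Thread_local int     local_count = 0;

void *pool_alloc(void)
{
    if (local_free == NULL) {
        /* Rebalance in: one lock acquisition fetches up to BATCH items. */
        pthread_mutex_lock(&global_lock);
        lock_count++;
        while (local_count < BATCH && global_free != NULL) {
            item_t *it  = global_free;
            global_free = it->next;
            it->next    = local_free;
            local_free  = it;
            local_count++;
        }
        pthread_mutex_unlock(&global_lock);

        /* If the global pool was short, top up from the heap. */
        while (local_count < BATCH) {
            item_t *it = malloc(ITEM_SIZE);
            if (it == NULL)
                break;
            it->next   = local_free;
            local_free = it;
            local_count++;
        }
        if (local_free == NULL)
            return NULL;
    }
    item_t *it = local_free;   /* common case: no lock taken */
    local_free = it->next;
    local_count--;
    return it;
}

void pool_free(void *p)
{
    assert(p != NULL);
    item_t *it = p;
    it->next   = local_free;   /* common case: no lock taken */
    local_free = it;
    local_count++;

    if (local_count > LOCAL_MAX) {
        /* Rebalance out: one lock acquisition returns BATCH items. */
        pthread_mutex_lock(&global_lock);
        lock_count++;
        for (int i = 0; i < BATCH; i++) {
            item_t *m   = local_free;
            local_free  = m->next;
            m->next     = global_free;
            global_free = m;
            local_count--;
        }
        pthread_mutex_unlock(&global_lock);
    }
}

/* Reports how often the global lock was taken (single-threaded use only). */
unsigned long pool_lock_count(void) { return lock_count; }
```

An item freed on a different thread than the one that allocated it simply
lands in that thread's local list and migrates back through the global
pool at the next rebalance, so the lock is taken once per batch rather
than once per alloc/free.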

Dispatch uses alloc.c in select cases, where rapid and frequent
allocation is expected.  The general heap or static allocation is used
everywhere else.

> 
> qd_buffer_set_size/qd_buffer is not thread safe. We don't seem to use
> set_size; we should get rid of it if we don't need it, or make it
> thread safe if we do.

Thanks for pointing this out.  I'll take a look at it.

> 
> router_semantics_for_addr: linear search for address. This is not on the
> critical path for messages, but it is on the path for subscriptions, and
> we've seen situations in the past where creating subscriptions can be
> critical too.

You are correct about this.  The linear search was done for expediency.
This should be replaced by a higher-performance index.
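
As a rough illustration of what such an index could look like, the
sketch below replaces a linear scan with a sorted array and bsearch().
It assumes exact-match string keys; the real router_semantics_for_addr
may match differently (for example on prefixes), in which case a trie or
similar structure would be a better fit. The addr_entry_t type and the
addr_index_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry: an address key plus whatever it maps to. */
typedef struct {
    const char *address;
    int         semantics;  /* placeholder for the stored semantics */
} addr_entry_t;

/* Orders entries by address so that bsearch() can be used. */
static int cmp_entry(const void *a, const void *b)
{
    const addr_entry_t *ea = a;
    const addr_entry_t *eb = b;
    return strcmp(ea->address, eb->address);
}

/* Sort once, after the table is loaded or changed. */
void addr_index_build(addr_entry_t *entries, size_t count)
{
    qsort(entries, count, sizeof(addr_entry_t), cmp_entry);
}

/* O(log n) exact-match lookup; returns NULL if the address is absent. */
const addr_entry_t *addr_index_find(const addr_entry_t *entries,
                                    size_t count, const char *address)
{
    assert(address != NULL);
    addr_entry_t key = { address, 0 };
    return bsearch(&key, entries, count, sizeof(addr_entry_t), cmp_entry);
}
```

A sorted array suits a table that changes rarely and is read often; if
addresses are added and removed frequently, a hash table would avoid the
re-sort.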

-Ted

> 
> Thanks, and apologies if I've missed anything obvious.
> 
> Cheers,
> Alan.
