Posted to dev@httpd.apache.org by "William A. Rowe, Jr." <wr...@lnd.com> on 2000/07/25 03:55:54 UTC

Not [PATCH] Filter registration :)

And you suggest there are no issues with filtering + async (LOL)

> From: Greg Stein [mailto:gstein@lyra.org]
> Sent: Monday, July 24, 2000 8:47 PM
>
> If a bucket has a self-defined lifetime, then the ap_bucket_t must be
> allocated from the heap/pool. Usually, its data must also be managed in a
> way that is separate from the execution stack (e.g. rmem buckets have issues
> in that they can refer to stack-based data; that prevents their self-
> defined lifetime).

I actually found your #3 (a pipe read is a terminal condition that can't be
undone) to be the most significant issue.

By definition, when and if we transition to async buckets, nothing can be
living on the stack.  Not that we can't pass the bucket from filter to
filter as arguments, but it needs to live in the pool.  And, as far as 
the lifetime is concerned, I suggest they live for the duration of their 
request, unless discarded (read on :)

Is there any thought to fixed-size buckets (8k, for example) in a common
pool for the process (talking about a threaded model, here, or perhaps
in shared memory across processes) that will live for eternity?  Simply 
grab and discard as the filters pass them around.  If I must rewrite a 
bucket (can't just tweak it), then grab another bucket and throw the 
last back to the pool.  If I need a second bucket (growing the response), 
then I need to just grab one from the pool.  All buckets must be thrown 
back into the available pool by Apache at the end of the request.

I don't like relying on the module writer to do-it-right when we will be
dying a slow, painful death due to a module's leak.

The advantage lies in the fact that shared file memory can be swapped in
and out, the bucket list can grow if necessary, and could even be shrunk
if we always allocate from the head (so once that ornery huge request is
done, and odd leftover buckets are released, no one is sitting at the end
of the pool.)

Thoughts?

Bill


Re: Not [PATCH] Filter registration :)

Posted by Greg Stein <gs...@lyra.org>.
On Mon, Jul 24, 2000 at 08:55:54PM -0500, William A. Rowe, Jr. wrote:
> And you suggest there are no issues with filtering + async (LOL)

Never said that. I said that it is inappropriate to use async as a reason
for preventing filtering work. Async will have an impact on *everything*.
Since we must rewrite everything to deal with async, then any filtering we
do now just falls into that "rewrite it all" work item. And while we may be
able to fine tune a bit of the filtering for the async model, none of the
designs on the table have a great match for a "pure" async implementation.
IOW, we will end up with a hybrid async/sync model.

And no: I'm not up for discussing async now. We aren't going to do it any
time soon, so why spend time talking about it now?

> > From: Greg Stein [mailto:gstein@lyra.org]
> > Sent: Monday, July 24, 2000 8:47 PM
> >
> > If a bucket has a self-defined lifetime, then the ap_bucket_t must be
> > allocated from the heap/pool. Usually, its data must also be managed in a
> > way that is separate from the execution stack (e.g. rmem buckets have issues
> > in that they can refer to stack-based data; that prevents their self-
> > defined lifetime).
> 
> I actually found your #3 (a pipe read is a terminal condition that can't be
> undone) to be the most significant issue.
> 
> By definition, when and if we transition to async buckets, nothing can be
> living on the stack.

"... living on the stack [at the time you pass the buckets to the async
output manager]." In other words, you copy the data off the stack at that
point.

Some background for terminology:

    {
        char buf[100];
        int n;

        n = fill_buf_with_stuff(buf, sizeof(buf));
        ap_rwrite(buf, n, r);
    }

Let's refer to the data residing in "buf" as "stack-based data".

If the stack-based data is going to be translated, then there will never be
a need to copy it (the original data is never passed, just the translated
data). If the stack-based data needs to live past the function return, then
it must be copied. So... to possibly optimize the number of copies, it is
important to allow stack-based data to exist within the output chain. At the
point that the lifetime of that data needs to change (if ever!), then it can
be copied.
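
A minimal sketch of that copy-on-demand idea, for illustration only. It
assumes a hypothetical bucket struct: sketch_bucket and its three functions
are invented names, not the ap_bucket_t layout or any existing Apache API.
The bucket starts out pointing at caller-owned (possibly stack-based) data,
and the data is copied to the heap only if its lifetime has to change.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bucket for illustration; NOT the real ap_bucket_t. */
typedef struct {
    const char *data;   /* bytes the bucket refers to */
    size_t      len;    /* number of bytes */
    int         owned;  /* nonzero once data has been copied to the heap */
} sketch_bucket;

/* Wrap caller-owned (possibly stack-based) data without copying it. */
static sketch_bucket sketch_bucket_wrap(const char *data, size_t len)
{
    sketch_bucket b;

    assert(data != NULL || len == 0);
    b.data = data;
    b.len = len;
    b.owned = 0;
    return b;
}

/* Copy the data to the heap so it can outlive the caller's stack frame.
 * Returns 0 on success, -1 if the allocation fails. */
static int sketch_bucket_setaside(sketch_bucket *b)
{
    char *copy;

    if (b->owned)
        return 0;                       /* already heap-based */
    copy = malloc(b->len ? b->len : 1);
    if (copy == NULL)
        return -1;
    if (b->len)
        memcpy(copy, b->data, b->len);
    b->data = copy;
    b->owned = 1;
    return 0;
}

/* Free the heap copy, if one was ever made, and reset the bucket. */
static void sketch_bucket_release(sketch_bucket *b)
{
    if (b->owned)
        free((void *)b->data);
    b->data = NULL;
    b->len = 0;
    b->owned = 0;
}
```

A filter that translates the data never calls sketch_bucket_setaside(), so
no copy is made; only a consumer that must hold the data past the function
return pays for one.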

[ there is an argument that we should never pass stack-based data into the
  filter chain. I believe that is a poor decision, but I'm hoping that I
  don't have to illustrate why... ]

> Not that we can't pass the bucket from filter to
> filter as arguments, but it needs to live in the pool.  And, as far as 
> the lifetime is concerned, I suggest they live for the duration of their 
> request, unless discarded (read on :)

Actually, the data would need to live for the duration of the *connection*.
It will be nice to jam two responses into one network packet. Thus, the data
going into that network packet must survive the finalization of the first
request.

> Is there any thought to fixed-size buckets (8k, for example) in a common
> pool for the process (talking about a threaded model, here, or perhaps
> in shared memory across processes) that will live for eternity?  Simply 
> grab and discard as the filters pass them around.  If I must rewrite a 
> bucket (can't just tweak it), then grab another bucket and throw the 
> last back to the pool.  If I need a second bucket (growing the response), 
> then I need to just grab one from the pool.  All buckets must be thrown 
> back into the available pool by Apache at the end of the request.

Nope. No thoughts have been applied in this area. You are effectively
describing pools -- pools keep a "free list" of blocks to use for future
allocations. Of course, how this interacts with pools... dunno. I'm not sure
what the design's goal is.
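
For concreteness, a rough sketch of the "grab and discard" scheme as a free
list of fixed-size buckets. Everything here is an assumption made for
illustration: the sketch_* names are hypothetical, the code is
single-threaded with no locking, and it uses malloc() rather than pools or
shared memory.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_BUCKET_SIZE 8192   /* the "8k, for example" in the proposal */

/* Hypothetical fixed-size bucket; not an existing Apache type. */
typedef struct sketch_fbucket {
    struct sketch_fbucket *next;  /* link used while on the free list */
    size_t used;                  /* bytes of data[] currently in use */
    char   data[SKETCH_BUCKET_SIZE];
} sketch_fbucket;

typedef struct {
    sketch_fbucket *free_head;    /* buckets available for reuse */
    size_t          outstanding;  /* grabbed but not yet discarded */
} sketch_bucket_pool;

/* Take a bucket from the head of the free list, growing the pool with
 * malloc() when the list is empty. Returns NULL if allocation fails. */
static sketch_fbucket *sketch_grab(sketch_bucket_pool *p)
{
    sketch_fbucket *b = p->free_head;

    if (b != NULL) {
        p->free_head = b->next;
    } else {
        b = malloc(sizeof(*b));
        if (b == NULL)
            return NULL;
    }
    b->next = NULL;
    b->used = 0;
    p->outstanding++;
    return b;
}

/* Throw a bucket back onto the head of the free list. */
static void sketch_discard(sketch_bucket_pool *p, sketch_fbucket *b)
{
    assert(p->outstanding > 0);
    b->next = p->free_head;
    p->free_head = b;
    p->outstanding--;
}
```

The outstanding counter is one way the server could notice, at the end of
a request, that a module failed to hand its buckets back.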

> I don't like relying on the module writer to do-it-right when we will be
> dying a slow, painful death due to a module's leak.

Any time you allocate, it is possible that you may leak. We avoid those
through the use of pools. If your bucket structures (not necessarily the
data!) are defined on the stack, then you can't possibly leak them. The
data referenced by the bucket has the same lifetime, so it is also quite
easy to manage it.

It only gets complicated when somebody wants to "set aside" data. The other
complication is at the bottom of the filter chain. What is the "right"
mechanism for buffering up network output to minimize the number of packets?

> The advantage lies in the fact that shared file memory can be swapped in
> and out, the bucket list can grow if necessary, and could even be shrunk
> if we always allocate from the head (so once that ornery huge request is
> done, and odd leftover buckets are released, no one is sitting at the end
> of the pool.)
> 
> Thoughts?

I don't quite follow your description. I don't quite see the benefits over
using pools for the memory management. Care to code up something to
demonstrate the idea? :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/