Posted to dev@httpd.apache.org by Greg Stein <gs...@lyra.org> on 2000/06/27 12:39:45 UTC

what are the issues? (was: Re: Patch review: ...)

On Mon, Jun 26, 2000 at 09:39:43PM -0700, rbb@covalent.net wrote:
>...
> We are definitely making more progress.

Agreed.

> I am beginning to think though
> that we aren't making much more progress.  I need to step back from this
> for a while.  I am -1 for putting ANY patch in at all right now.

I can wait, but I don't have a clear understanding on your issues with my
patch. I also don't want to see us try to solve ALL the filtering problems
in one fell swoop. We should take a step at a time.

> I am
> also beginning to think that we may need to re-implement some pool
> functions in order to do this at all properly.

*blink*  Eh? I don't see this, but I'm sure you will demonstrate :-)

I'll also say that I believe my patches empirically show that pool changes
aren't needed to move us that first step.

>...
> I am -1 for using Greg's patch for all the reasons I dislike my patch.

I have three review points from you so far, each of which is solvable. I
would like to hear from you whether these still seem unsolved, or whether you
see other problems:

1) need to delay issuing headers (the r->headers_sent thing in your patch)

   SOLUTION: Simple addition, solved in my scheme the same way you did. I
             deferred simply because I knew it was disjoint and did not have
             to be solved in this first step.

   RATIONALE: When a content-generator calls ap_rwrite(), they have already
              given up hope on modifying the headers. We will not be
              changing the status quo.

	      What it *does* mean, however, is that the filters themselves
	      cannot change the headers during their content processing
	      (they certainly can at other points in the request
	      processing). I consider this an additional feature, for a
	      later step.

2) Should not use char* callbacks.

   SOLUTION: I disagree this is a problem. It is a stylistic issue, but has
             no technical problems that are not present with buckets or
             ioblock-based callbacks.

   RATIONALE: If a filter callback is passed a bucket/ioblock and it needs a
	      char* to examine the actual content, then it must call a
	      utility function right away to get that char*:
	      
	      my_callback(bucket):
	          str = give_me_string(bucket)
		  process_string(str)

              The simple callback form alters this to:
	      
	      str = give_me_string(bucket)
	      my_callback(str):
	          process_string(str)


              By doing the "give me a string" in the framework, we have more
	      opportunity to optimize the string's construction and
	      handling. For example, create and destroy a pool to hold the
	      string while the callback is invoked. In the "build the string
	      inside" approach, the callback would look like:
	      
	      my_callback(bucket):
	          pool = create_pool(bucket->r->pool)
		  str = give_me_string(bucket)
		  process_string(str)
		  destroy_pool(pool)

              or:
	      
	      my_callback(bucket):
		  str = give_me_string(bucket, &pool)
		  process_string(str)
		  destroy_pool(pool)

              For the simple callback, the pool handling occurs *outside*
              the callback. It remains just as simple as above. The point
              here is that we have optimization choices that will not affect
              the simple_cb style of filter. (A short C sketch of the two
              callback styles follows point 3 below.)

3) ordering issues

   SOLUTION: Per a previous email, I added a condition to the scan loop in
	     ap_add_filter(). I had guessed that would eventually be added,
	     but wanted to defer it until it was found to be really
	     necessary.

   RATIONALE: not needed :-) ... this has been addressed explicitly.
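
Back on point 2 for a moment: for concreteness, here is a rough C sketch of
the two callback styles being compared. Every type and function name below is
illustrative only -- these are not the identifiers used in either posted
patch.

    /* Illustrative only -- not code from either patch. */
    #include <stddef.h>

    typedef struct example_bucket {
        const char *data;   /* NULL when the content is not yet in memory */
        size_t      len;
        /* ... a type tag, file handle, fmt/va_list, etc. would go here ... */
    } example_bucket;

    /* Bucket-style filter: the callback receives the raw bucket and must
     * map it to bytes itself (the give_me_string() step in the pseudocode
     * above). */
    typedef int (*bucket_filter_fn)(void *ctx, example_bucket *b);

    /* char*-style filter: the framework has already produced the bytes, so
     * the callback only ever sees a ptr/len pair. */
    typedef int (*string_filter_fn)(void *ctx, const char *str, size_t len);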


Other problems?

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

char* handler (was: what are the issues?)

Posted by Greg Stein <gs...@lyra.org>.
On Thu, Jun 29, 2000 at 11:30:51PM -0700, Roy T. Fielding wrote:
> >Point is: the char* callback does exactly what an ioblock/bucket callback
> >would do on its own when it must examine each byte.
> >
> >So, I will state again: the char* callback is not a problem. If you
> >disagree, then please explain further.
> 
> There is a significant flaw in that argument.  char * doesn't do what we
> want when a filter does not have to examine each byte.  That is the problem.
> 
> It doesn't make any sense to have two filter interfaces when you can
> accomplish the same with one and a simple parameter conversion function.

The char* handler is not suited for all filters. Granted. I've maintained
that it is simply a convenience for those filters that don't want to munge
through the bucket interface.

Consider the case where you need to crawl through all the bytes of the
content delivered into your filter (gzip, recoding, SSI, PHP, etc). Now
let's take the bucket-based interface from my patch:

  my_callback(filter, bucket)
  {
      ... how to process each character in the bucket? ...
  }

The processing gets a bit hairy, and it would have to be duplicated in all of
the each-char-walking filters. I believe Jim Winstead said something like:

    p = bucket->get_data()

Unfortunately, that isn't quite enough. If the bucket represents a file,
then you can't simply get a pointer to it (don't want to read it all into
memory, and maybe mmap is not present on the platform). This implies that
you will have a read-loop in your callback:

    read_context = prep_read_context(bucket);
    while (1) {
        p = bucket->get_more_data(read_context);
        if (p == NULL)      /* bucket exhausted */
            break;

        ... process p ...
    }

Now, we can't just keep reading chunks of data from that file endlessly
(into the heap), so we need some kind of buffer management:

    if (ctx->read_buf == NULL)
        ctx->read_buf = ap_palloc(some_pool, READ_BUF_SIZE);
    read_context = prep_read_context(bucket, ctx->read_buf);
    ...

All right.. Maybe that is "okay" for people to do in each filter, and works
for the file case.

Uh oh... what happens when somebody calls ap_rprintf()? If we attempt to
delay the actual formatting as late as possible (so that BUFF can do it
directly into the network buffer), then we pass around a fmt/va_list pair.
Our issue here is to reduce that to characters for scanning. We can use
ap_vsnprintf() to drop it into ctx->read_buf, but what happens on overflow?
Now we need to allocate a big enough block from somewhere, format into it
using ap_pvsprintf(), and then toss it out. The only "toss" that we have
right now is destroying a pool.
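
As a sketch of that problem: the cheap path is to format into the filter's
fixed buffer first and only worry about the "toss" question on overflow. The
READ_BUF_SIZE name matches the earlier snippet; the helper itself is made up,
and it uses standard C vsnprintf() (whose return value reports the full
length needed) rather than the ap_* variant.

    #include <stdarg.h>
    #include <stdio.h>

    #define READ_BUF_SIZE 8192

    /* Try to format fmt/ap into the fixed, reusable buffer.  Returns the
     * number of bytes produced, or -1 if the result would not fit -- in
     * which case a larger, short-lived allocation (and a way to toss it,
     * i.e. a throw-away pool) is needed, which is exactly the issue raised
     * above. */
    static int format_into_fixed(char *read_buf, const char *fmt, va_list ap)
    {
        int len = vsnprintf(read_buf, READ_BUF_SIZE, fmt, ap);
        return (len >= 0 && len < READ_BUF_SIZE) ? len : -1;
    }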

My intent was to simplify the job for filters. When they don't want to do
this work, they use a char* handler and get plain old bytes. Simple, clean,
and easy.
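
To make "plain old bytes" concrete, here is a minimal sketch of what a
char*-style filter could look like. The callback signature, the context
struct, and the pass_to_next hook are all hypothetical; the point is only
that the filter body is a flat loop over bytes with a fixed working buffer.

    #include <ctype.h>
    #include <stddef.h>

    struct upcase_ctx {
        size_t bytes_seen;                 /* running total, just as an example */
        void  *next;                       /* opaque handle for the next filter */
        int  (*pass_to_next)(void *next, const char *buf, size_t len);
    };

    /* char*-style filter: upper-cases ASCII as it streams through, using a
     * fixed on-stack buffer so memory use never grows with the response. */
    static int upcase_filter(struct upcase_ctx *ctx, const char *str, size_t len)
    {
        char buf[4096];

        while (len > 0) {
            size_t n = (len < sizeof(buf)) ? len : sizeof(buf);
            size_t i;

            for (i = 0; i < n; i++)
                buf[i] = (char)toupper((unsigned char)str[i]);

            ctx->bytes_seen += n;
            if (ctx->pass_to_next(ctx->next, buf, n) < 0)
                return -1;

            str += n;
            len -= n;
        }
        return 0;
    }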

When filters can be smarter and work with files or bytes, or whatever, then
they can use the bucket-based callback. Should we encourage everybody to use
the bucket interface? Probably. Does this encouragement change the patch
that is submitted? I don't think so.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

buckets vs strings (was: what are the issues?)

Posted by Greg Stein <gs...@lyra.org>.
On Thu, Jun 29, 2000 at 11:30:51PM -0700, Roy T. Fielding wrote:
>...
> The point here isn't that this is the only way to implement filters.
> The point is that no other interface can implement them as efficiently.
> Not even close.  Yes, there are cases where string filters are just as
> efficient as any other design, but there is no case in which they are
> more efficient than bucket brigades.

Um. I think it is important to clarify that string-filters are not the only
option in my patch (dunno if you knew this). I *do* have a bucket interface,
and it can process things in the fashion you describe.

I do not have lists of buckets, but that is merely adding a "next" pointer.

The intent is to provide a small, tight, reviewable framework that allows
for the growth into completeness. Want a list of buckets? A "next" pointer
is easy. Want an "End of Stream" bucket? Another bucket type. Want a bucket
that carries its own data to allow for arbitrary lifetimes? Another bucket
type (coded that example last night).
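
A sketch of what that growth could look like in C. The enum and field names
here are illustrative, not the identifiers in the posted patch:

    #include <stddef.h>

    typedef enum {
        EX_BUCKET_PTRLEN,    /* bytes already in memory                        */
        EX_BUCKET_FILE,      /* a file region, read or mmap'd on demand        */
        EX_BUCKET_OWNDATA,   /* carries its own copy, so arbitrary lifetime    */
        EX_BUCKET_EOS        /* possible "end of stream" marker                */
    } ex_bucket_type;

    typedef struct ex_bucket ex_bucket;
    struct ex_bucket {
        ex_bucket_type type;
        ex_bucket     *next;     /* the "next" pointer: NULL-terminated list   */
        const char    *data;     /* PTRLEN / OWNDATA                           */
        size_t         len;
        int            fd;       /* FILE                                       */
        long           offset;   /* FILE                                       */
    };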

Starting small and lean helps us to understand what is going in. The patch
that I posted for committing is a simple 850 lines. A continuing sequence of
development easily brings us to the position that you are looking for, Roy.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: what are the issues? (was: Re: Patch review: ...)

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
rbb@covalent.net wrote:
> 
> I agree 100%.  We have definitely stopped yammering and started
> listening and working together.

Maybe that's how it seems to you guys, but not to me. :-)

> I have reviewed the patches, and done code inspections.

Have you applied the patches and tried them out?  I don't
see how you guys can be disagreeing on the code's behaviour
if you've both actually *tried* it.  I just don't get it.

> The problem here is that Greg and I disagree on the basic design.
> I don't think either one of us has any doubts that the code posted
> does what it says it does.  The problem is that the designs are
> VERY different.

Then please rescind your veto and let the normal 'three +1' RTC
method work.  When three people have tried it and said that It
Is Good, that should be sufficient.  If the code does what it's
supposed to do, it would seem that the issues that caused the
veto have been addressed.
-- 
#ken    P-)}

Ken Coar                    <http://Golux.Com/coar/>
Apache Software Foundation  <http://www.apache.org/>
"Apache Server for Dummies" <http://Apache-Server.Com/>
"Apache Server Unleashed"   <http://ApacheUnleashed.Com/>

Re: what are the issues? (was: Re: Patch review: ...)

Posted by rb...@covalent.net.
> We're past the yammering, and are working on the "challenge problem" now :-)

I agree 100%.  We have definitely stopped yammering and started listening
and working together.

> My issues with Ryan's patches have been from code inspections. Each issue
> was derivable from that standpoint alone.

I have reviewed the patches, and done code inspections.

The problem here is that Greg and I disagree on the basic design.  I
don't think either one of us has any doubts that the code posted does what
it says it does.  The problem is that the designs are VERY different.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------


Re: what are the issues? (was: Re: Patch review: ...)

Posted by Greg Stein <gs...@lyra.org>.
On Thu, Jun 29, 2000 at 02:09:26PM -0400, Rodent of Unusual Size wrote:
> Greg, have you actually *tried* Ryan's patch(es)?
> Ryan, have you actually *tried* Greg's patch(es)?
> If the answer to either is 'no,' then please stop the
> theoretical handwaving and DO IT.  Then see if your
> theories are supported by the evidence.  This is
> is an aspect of the 'scientific method,' which I
> submit is more appropriate here than Philosophy 101.
> 
> And no, I haven't [yet] tried *either* patch.  But then
> neither am I yelling that either of them is broken.
> 
> This yammering is really starting to get on my nerves.

We're past the yammering, and are working on the "challenge problem" now :-)

My issues with Ryan's patches have been from code inspections. Each issue
was derivable from that standpoint alone.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: what are the issues? (was: Re: Patch review: ...)

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
Greg, have you actually *tried* Ryan's patch(es)?
Ryan, have you actually *tried* Greg's patch(es)?
If the answer to either is 'no,' then please stop the
theoretical handwaving and DO IT.  Then see if your
theories are supported by the evidence.  This is
an aspect of the 'scientific method,' which I
submit is more appropriate here than Philosophy 101.

And no, I haven't [yet] tried *either* patch.  But then
neither am I yelling that either of them is broken.

This yammering is really starting to get on my nerves.
-- 
#ken    P-)}

Ken Coar                    <http://Golux.Com/coar/>
Apache Software Foundation  <http://www.apache.org/>
"Apache Server for Dummies" <http://Apache-Server.Com/>
"Apache Server Unleashed"   <http://ApacheUnleashed.Com/>

Re: what are the issues? (was: Re: Patch review: ...)

Posted by Greg Stein <gs...@lyra.org>.
On Tue, Jun 27, 2000 at 08:19:16AM -0700, rbb@covalent.net wrote:
>...
> > I can wait, but I don't have a clear understanding on your issues with my
> > patch. I also don't want to see us try to solve ALL the filtering problems
> > in one fell swoop. We should take a step at a time.
> 
> The issues are intertwined.  You can't just lop off part of the problem
> and try to solve it that way.  We need a design that incorporates
> everything.  Part of what scares me with your design is that you do keep
> leaving the small stuff off.  This makes the patch incomplete.

Please specify which features you believe have been lopped off and are
required for patch acceptance. This is too ambiguous.

>...
> > > also beginning to think that we may need to re-implement some pool
> > > functions in order to do this at all properly.
> > 
> > *blink*  Eh? I don't see this, but I'm sure you will demonstrate :-)
> > 
> > I'll also say that I believe my patches empirically show that pool changes
> > aren't needed to move us that first step.
> 
> It's not the first step that's important here.  If you take the first
> step, and tie ourselves to pools, then the second step is not possible.

I am assuming by "tie ourselves to pools" you are referring to the following
paragraphs. See below for my response -- we are not tied, and your
suppositions are incorrect.

> One of the things you've consistently said about my patch, is that
> everything is allocated on the heap, and that means things stick around
> until the end of the request.  That's true.  The fact is though, that the
> same is true of your patch, once we create complicated filters.

No, I said that about your original hook strategy. This is not true in
either of the recursive strategies.

The original content, its intermediate filtered content, and the final
content do not have to be present on the heap, in full. I never expect them
to be.

What *will* happen is that filters will allocate working buffers on the
heap. These will be reused during the request processing, but will be a
fixed amount of memory -- not growing with the size of the response.

I can provide a proof-by-example if you do not believe me.

> Yes, this behavior is not seen in mod_gargle or any of the examples we
> have coded so far.  It is there in some of the examples that we have
> ignored (both of us).  For example, take real-time color quantization
> (Kevin Kiley's example), which is then gzip compressed.

I am not familiar with the color quantization algorithm for JPEGs. I don't
know if that will require the entire picture to be in memory at once. For
GIFs, I believe I'm familiar with the structure of the file, and it will
support streaming quantization.

GZIP is a streaming algorithm and can be implemented using a working buffer.
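
As an illustration of the fixed-working-buffer claim, here is the core of a
streaming deflate step using stock zlib. The gzip header/trailer handling and
the filter plumbing around it are omitted, and emit() stands in for "hand the
compressed bytes to the next filter"; this is a sketch, not code from either
patch.

    #include <zlib.h>

    #define WORKBUF 8192   /* fixed size: does not grow with the response */

    /* One streaming compression step: feed in_len input bytes through the
     * deflate state, draining the fixed output buffer as often as needed.
     * deflateInit() is called once when the filter starts, deflateEnd()
     * once at the end; pass finish != 0 with the final chunk. */
    static int deflate_chunk(z_stream *z, const char *in, size_t in_len, int finish,
                             int (*emit)(void *ctx, const char *buf, size_t len),
                             void *ctx)
    {
        char out[WORKBUF];

        z->next_in  = (Bytef *)in;
        z->avail_in = (uInt)in_len;

        do {
            z->next_out  = (Bytef *)out;
            z->avail_out = WORKBUF;

            if (deflate(z, finish ? Z_FINISH : Z_NO_FLUSH) == Z_STREAM_ERROR)
                return -1;

            if (emit(ctx, out, WORKBUF - z->avail_out) < 0)
                return -1;
        } while (z->avail_out == 0);

        return 0;
    }

However large the response, the only per-request memory here is the z_stream
state plus the two fixed buffers.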

> In the link-based scheme, because lower filters have no way of knowing
> where the data they are using comes from, they have to memcopy
> it.  Without changes to the pool structure (or ignoring pools), this
> requires allocating a buffer, and just sticking all the data in the buffer
> until you have soaked up all the data.  Then, the data can be passed to
> the gzip filter, which also has to memcopy it (could be an mmap'ed
> file) off to the side.

This is patently false. A fixed-size working buffer can be used. The working
set does not increase with the size of the response.

Again, I can prove this by example if you do not believe me.

> Now, you have two full copies of the data sitting on the heap, neither of
> which will be freed until the request is done.  And, there is no way to
> free them, because the pool used is the same as the pool the request_rec
> comes from.

False.

> Now, make the image that we are doing this to REALLY big, like 30 
> MB.  Now, you have the original 30 MB that was allocated (most likely
> mmap'ed), plus the 30MB that was allocated for the quantifying layer, plus
> the 30 MB that was allocated for the gzip layer.

False.

> What we need is exactly what Roy and Dean have been proposing, which is
> buckets.  Not the buckets you have implemented or the buckets I have
> implemented, because they don't go far enough.

I disagree. You have not shown this to be true.

> > > I am -1 for using Greg's patch for all the reasons I dislike my patch.
> > 
> > I have three review points from you so far, each of which is solvable. I
> > would like to hear from you whether these still seem unsolved, or whether you
> > see other problems:

[ Point 1 (delay issuing the HTTP headers) was not responded to in Ryan's
  response, so I must presume that it is no longer an open issue.
  
  Please see my original note for more detail on Point 1. ]

> > 2) Should not use char* callbacks.
> > 
> >    SOLUTION: I disagree this is a problem. It is a stylistic issue, but has
> >              no technical problems that are not present with buckets or
> >              ioblock-based callbacks.
> 
> It does have drawbacks and problems that aren't there in the other
> cases.  The other cases allow for single-copy writes.  This
> doesn't.  Because the other design allows people to associate everything
> about the bucket with the bucket.

The bucket/ioblock case allows a single-copy write *IFF* the filter does not
have to examine each byte of content. *IF* the filter requires that
examination, then there is no way that a bucket/ioblock solution can
possibly manage a single-copy write.

Consider the four bucket types (the two in an ioblock are a subset) that I
have proposed:

1) AP_BUCKET_PTRLEN: the filter examines each byte as it passes through. No
   additional copying is required since the bytes are immediately available.

2) AP_BUCKET_FILE: the file must be mapped into memory, or read into memory
   if mmap is not available. Bam. This is not single-copy.

3) AP_BUCKET_STRINGS: this is a va_list of (const char *) pointers. The
   filter can use va_start/va_end to walk through them until the NULL
   pointer is hit. Since the bytes are immediately available, there is no
   copying required.

4) AP_BUCKET_PRINTF: this is a fmt/va_list pair. Strictly speaking, it might
   be possible for a filter to use a vformatter to walk through the thing
   without copying any bytes. Nobody will code this, however (even BUFF
   doesn't try this). Therefore, it will be formatted into some kind of
   buffer (on the stack or heap). Bam. This is not single-copy.

The algorithms listed above for each type are exactly what the framework does
when a char* callback is used. Therefore, it optimizes (down) the number of
copies required.
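
For instance, the AP_BUCKET_STRINGS case (type 3 above) reduces, inside the
framework, to walking the va_list and handing each string to the char*
callback. The helper below is a sketch of that step only; cb/ctx are made-up
names:

    #include <stdarg.h>
    #include <string.h>

    /* Walk a NULL-terminated va_list of (const char *) and pass each one
     * to the filter's char* callback.  The bytes go straight through: no
     * copy is made on this path. */
    static int pass_strings(va_list va,
                            int (*cb)(void *ctx, const char *s, size_t n),
                            void *ctx)
    {
        const char *s;

        while ((s = va_arg(va, const char *)) != NULL) {
            if (cb(ctx, s, strlen(s)) < 0)
                return -1;
        }
        return 0;
    }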

Point is: the char* callback does exactly what an ioblock/bucket callback
would do on its own when it must examine each byte.

So, I will state again: the char* callback is not a problem. If you
disagree, then please explain further.

> The problem is that we are hiding what we are doing from the programmer,
> so they can't make good decisions.  By telling the programmer to operate on
> this string, and I'll give you more later, you are crippling what they
> can do.  (example in next section using mod_autoindex.)

1) if the programmer does not want us to map buckets to a char* for them,
   then they can write a bucket callback. But char-by-char filters will want
   to use the char* style.

2) no matter *WHAT* we do, it will always be the case that we will give them
   chunks of data. There is nothing that can avoid that, and you cannot
   claim that the char* callback promotes that behavior.

Summary: there is no crippling or hiding happening here. I fail to see your
point.

> >    RATIONALE: If a filter callback is passed a bucket/ioblock and it needs a
> > 	      char* to examine the actual content, then it must call a
> > 	      utility function right away to get that char*:
> > 	      
> > 	      my_callback(bucket):
> > 	          str = give_me_string(bucket)
> > 		  process_string(str)
> > 
> >               The simple callback form alters this to:
> > 	      
> > 	      str = give_me_string(bucket)
> > 	      my_callback(str):
> > 	          process_string(str)
> 
> But, the design encourages people to not do this well.

Wait just a minute here.

Which design are you referring to? The filter design, or the char* callback
design?

If you believe that the char* callback encourages people to write bit-by-bit
content generators, then you are mistaken. The two processes/algorithms are
*TOTALLY* disconnected. The content generator does its thing. The filter
does its thing. Nothing that either one does has any impact on how the other
is coded/designed/built.

> Let me give you an
> example.  Mod_autoindex.c:
> 
> Currently, mod_autoindex calls ap_rputs with a bunch of very small
> strings.  If we pass down strings, then we are essentially telling module
> writers that this is a valid way to write data.

Module writers will implement their modules this way. Simple fact.

Disconnected from that, how we pass the content around is a different story.

1) Most of mod_autoindex's calls are ap_rputs() and ap_rvputs(). They
   already *are* strings. Why not pass them as such?

2) mod_autoindex has a couple ap_rprintf() calls. For a filter to process
   those bytes, the fmt/args must be mapped into a char*. Somebody must do
   that: the content generator, the filter framework, or the filter. I took
   the position: don't affect how content generators implement their code,
   and simplify the process for the filter writers. Therefore, it goes in
   the framework.

I fail to see how the char* callback can possibly encourage any particular
style of writing a content generator.

> This will increase our
> memory footprint.  For example:   mod_autoindex  ->  mod_digest.

False.

As I've stated before, there are no allocations in my filter framework that
scale with the size of the response.

> Mod_autoindex will write little strings a bunch of times, and mod_digest
> will cache them all (on the heap, from the pool).  When mod_digest has
> seen all of the data, it will calculate the md5sum, and then pass it on to
> the next layer.

False.

mod_auth_digest will compute the digest on a streaming basis. It will cache
nothing. When the trailer headers are delivered, the digest value goes out
at that time.

If chunking is not used, then the digester would have to spool the response
somewhere (mem or disk).

Regardless, this does not invalidate the char* callback style. I see no
connection.
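
To illustrate the streaming behaviour described above, a digest filter in the
char* style would look roughly like this. The ap_MD5* routines are the
Apache 1.3-era names from ap_md5.h (check the header for exact signatures);
the context struct, the NULL-as-end-of-content convention, and both hooks in
it are assumptions for the sketch.

    #include <stddef.h>
    #include "ap_md5.h"     /* AP_MD5_CTX, ap_MD5Update, ap_MD5Final */

    struct digest_ctx {
        AP_MD5_CTX md5;     /* ap_MD5Init() is called once when the filter starts */
        void      *next;    /* opaque handle for the next filter                  */
        int      (*pass_to_next)(void *next, const char *s, size_t n);
        void     (*emit_trailer)(void *next, const unsigned char digest[16]);
    };

    /* Streaming digest: update the running MD5 as bytes flow past, cache
     * nothing.  str == NULL is this sketch's end-of-content signal, at
     * which point the digest goes out (e.g. as a trailer header). */
    static int digest_filter(struct digest_ctx *ctx, const char *str, size_t len)
    {
        if (str != NULL) {
            ap_MD5Update(&ctx->md5, (const unsigned char *)str, (unsigned int)len);
            return ctx->pass_to_next(ctx->next, str, len);  /* bytes pass through */
        }
        else {
            unsigned char digest[16];
            ap_MD5Final(digest, &ctx->md5);
            ctx->emit_trailer(ctx->next, digest);
            return 0;
        }
    }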

> A much better way to write this is to have mod_autoindex allocate a big
> block of data (on the heap, not using the pool, using malloc), which
> stores all of the data that mod_autoindex can grab.  This requires that
> mod_autoindex is re-written to do some of its own buffering.  Then, that
> data is passed as one big chunk to mod_digest, which doesn't need to copy
> it at all.  Mod_digest knows the data is modifiable, because the bucket
> tells it so.  Mod_digest also knows that if it gets too much data, it can
> write the data out to the disk, and just start freeing memory, because the
> memory is associated with the bucket, not the pool.  Later, when
> mod_digest has received all the data, it knows if anything has been
> written to disk, and can calculate the md5sum.

You are suggesting an architectural change to how content generators work.
That is out of scope.

Want to bring it into scope? Fine. I'll state that my framework supports
your suggestion.

1) mod_autoindex must use a new ap_rpass_buffer() since the current API does
   not support any notion of "passing responsibility" for a buffer of
   output.

2) The framework drops this into a new ap_bucket_type and calls
   mod_auth_digest. The latter sets it aside and/or spools to disk as you
   suggest.
   [ unless, of course, chunking is used; the digest would use a trailer
     header in that case and avoid the set-aside notion ]

3) When the framework sends the "flush/end" through, mod_auth_digest adds
   the digest value to the headers, pulls the data back, and sends it down
   to the next filter.

The above shows that my suggested filter framework *CAN* deal with
mod_auth_digest as you suggest. It can also operate in the presence of an
updated mod_autoindex.

"But wait!" you say. The char* obviates the above. Of course.
mod_auth_digest requires that the entire response is sometimes set-aside
because of the non-chunking/need-digest issue. It needs to see the whole
response before allowing the headers to go out.

Many filters can operate in a simple streaming mode and do not require this
huge set-aside.

Either way: my framework supports this concept. This would be in a future
change set. The changes are:

1) define ap_rpass_buffer()
2) define AP_BUCKET_PASS_BUFFER
3) update mod_auth_digest appropriately
4) update ALL(!) content generators to stop using ap_r*() and force them to
   use only ap_rpass_buffer()

These changes would also be preconditioned on the delayed-header output. I
believe we both agree that will happen in the short-term (coincident with a
patch, or the first thing after that).
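
For what it's worth, a guess at what the proposed -- and at this point purely
hypothetical -- ap_rpass_buffer() could look like; nothing resembling this
exists in the current API:

    /* Hypothetical, sketching item (1) above: the content generator hands
     * the whole buffer, and responsibility for releasing it, to the filter
     * chain in a single call. */
    int ap_rpass_buffer(request_rec *r,
                        char *buf, size_t len,
                        void (*release)(void *buf));  /* called when the chain is done */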

> Also, having both buckets and char *'s opens us up to bugs.  One of the
> incredibly nice things about the buckets, is that they allow programmers
> to group the data.  This means having lists of buckets.  In the code you
> have posted, if I create a list of buckets, and then I try to pass that to
> a char * filter, I will only get the first bucket.  The rest are
> forgotten.

Totally, impossibly, false. Complete FUD.

1) there is nothing in my framework for lists of buckets
2) presume there is a new put_buckets() API for sending a list of buckets
3) put_buckets() would iterate over the buckets, map them into a char*, and
   call the callback for each one.

This is the kind of crap that I get really pissed off about. When I hear Ken
thinking that my patch isn't ready for review "because Ryan keeps dissing
it", and I look at the above... no fuggin wonder he thinks that.

> (I noticed this yesterday, but I was busy hacking my patch,
> and then I was honestly pissed off, and I forgot to mention it).  The
> ability to string buckets together is where this feature really pays off.

A list of buckets can certainly be nice. The char* callback does nothing to
prevent that feature from being added at a later time if/when we discover
that it is important.

> >               By doing the "give me a string" in the framework, we have more
> > 	      opportunity to optimize the string's construction and
> > 	      handling. For example, create and destroy a pool to hold the
> > 	      string while the callback is invoked. In the "build the string
> > 	      inside" approach, the callback would look like:
> 
> If we associate the memory with a pool, we not only have to keep track of
> that pool, but also the pool's parent pool, and its parent's parent pool,
> etc.  Once one pool is destroyed, all of the child pools are
> destroyed.  This means that filters will have to either have A LOT of
> information about the lifetime of pools, or all of the data lives as long
> as the request.

What the hell?

Go take a look at my patch. The filters do ZERO pool management. The
ap_lvprintf() function creates and destroys a pool inline. The pool never
survives past the function return. Simple and sweet.
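
Reconstructed from that description, the inline-pool pattern looks roughly
like this. It is NOT the actual ap_lvprintf() body; the context struct is
made up, and the pool/printf routine names are the Apache 1.3-era ones
(ap_make_sub_pool, ap_pvsprintf, ap_destroy_pool), so check alloc.h for the
exact signatures.

    #include <stdarg.h>
    #include <string.h>
    #include "httpd.h"      /* pool, ap_make_sub_pool, ap_pvsprintf, ap_destroy_pool */

    struct filter_ctx {
        int (*callback)(struct filter_ctx *ctx, const char *str, size_t len);
    };

    /* p is typically r->pool; the sub-pool exists only inside this call. */
    static int lvprintf_sketch(struct filter_ctx *ctx, pool *p,
                               const char *fmt, va_list va)
    {
        pool *tmp = ap_make_sub_pool(p);
        char *str = ap_pvsprintf(tmp, fmt, va);       /* format into the throw-away pool */
        int   rv  = ctx->callback(ctx, str, strlen(str));
        ap_destroy_pool(tmp);                         /* the filter itself did zero pool work */
        return rv;
    }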

Possibly your above statements were made in regards to the "build the string
inside" approach, combined with pools. Okay, let's look at that code:

> > 	      my_callback(bucket):
> > 	          pool = create_pool(bucket->r->pool)
> > 		  str = give_me_string(bucket)
> > 		  process_string(str)
> > 		  destroy_pool(pool)

I don't see that pool having any lifetime issues. When the callback returns,
the pool is gone.

What are you talking about, when you say we must track the pools?

> I thought of this, and it is what I was getting at when I was talking
> about sibling pools, because it allows two filters to operate on two
> different pools.  It also requires that each filter copy all of the data
> into its own pool before it can do anything.  Otherwise, it doesn't know
> what the previous layers will do with their pools.
> 
> Look at it this way:
> 
>      my_callback(bucket)
>           pool = create_pool(bucket->r->pool);
>           str = give_me_string(bucket)
>           process_string(str)
>                  [which calls the next layer, which doesn't have enough
>                   data, and it moves the string off to the side.]
>           destroy_pool(pool)
>                  [The string is gone now, so the next layer had better
>                   have made a copy.  If each layer does this, we end up
>                   with a lot more data on the heap than we want. ]

1) the current ap_rwrite() behavior does not allow the inside to depend on
   the lifetime of the passed ptr/len pair.
2) my framework takes a similar approach w.r.t. arguments passed into the
   filter callbacks.
3) if a filter is going to set aside a value, then yes: it must make a copy
   because it cannot be guaranteed of the argument's lifetime.

FUTURE STEP:
4) if you create a new bucket type that carries its own pool, then we can be
   smarter.

Nothing in the current framework is obviating that future step.

[ and nothing in the current framework, or the proposed new bucket type
  would require any kinds of changes in our pool design. ]

> >               For the simple callback, the pool handling occurs *outside*
> > 	      the callback. It remains just as simple as above. The point
> > 	      here is that we have optimization choices that will not affect
> > 	      the simple_cb style of filter.
> 
> The point is that this looks like an optimization, but it is really not
> one.  In reality, this design (as well as any other that requires
> pools) requires more memory and pool memcpy's than are necessary.

False. See above.

The framework retains no additional memory to deal with the char* callback.
As always, the filter is free to consume memory, but there is nothing in the
design that encourages unlimited consumption.

A memcpy() is only needed if data must be set aside. This occurs in all cases.

A future solution can pass responsibility for a bucket down through the
layers. But that is an obvious future change, and is not obviated by the
current framework.

> >    SOLUTION: Per a previous email, I added a condition to the scan loop in
> > 	     ap_add_filter(). I had guessed that would eventually be added,
> > 	     but wanted to defer it until it was found to be really
> > 	     necessary.
> 
> I need to look at that still.

Please do. Until you say otherwise, I'll consider issue (3) closed.
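
For reference, an ordered insertion of the kind described would look roughly
like this. The struct and field names are illustrative, not the identifiers
from the patch's ap_add_filter():

    /* Keep the filter chain sorted by a type/ordering key while inserting,
     * which is the kind of condition added to the scan loop. */
    typedef struct ex_filter ex_filter;
    struct ex_filter {
        int        ftype;   /* e.g. content filters order before transport ones */
        ex_filter *next;
    };

    static void insert_filter_sorted(ex_filter **head, ex_filter *f)
    {
        while (*head != NULL && (*head)->ftype <= f->ftype)
            head = &(*head)->next;

        f->next = *head;    /* splice in before the first higher-ordered entry */
        *head = f;
    }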

> > Other problems?
> 
> The content-length thing is a real issue.  You can say the spec allows us
> to drop the content-length, but two people have said that isn't an
> option.  Kevin Kiley and Roy.  I'll be number three.  If we are not just
> going to drop the content-length, then we need to allow filters to keep
> the whole request around (but not in memory).

I provided a detailed response to this for Roy. The only place where we can
set the content-length is in the BUFF code (or at least in a "last" filter
right before calling BUFF). BUFF makes the most sense since it buffers up
the response already. If the response fits in that buffer: cool, we have a
content-length. If it does not, then a choice needs to be made: do we punt
on the content-length, or do we spool the whole response to the side until
it is completed?

Any of these options are not obviated by my patch. They are future,
additional steps that can be taken to provide more capability when filtering
is present.

Let's not forget: Content-Length is going to be present when the patches are
committed. It is *only* when we start experimenting with various filters
that we will run into this issue. Whether we check in code to deal with
Content-Length before or after that point, is up to us.

But nothing stops us now.

> This means being able to
> flush the data to disk, and free up the memory.  With all of the current
> patches, we are using pools.

My patches are not using pools. My patches provide a framework. If the
filters want to use pools or a file or something else... fine, let them.

> This means that we have one of two options:
> 
> 1)  Each layer creates its own sub-pool.  In this case, dumping to the
> disk doesn't really do us any good, because all of the previous layers
> have copies of the data lying around.
> 
> 2)  We use one pool.  In this case, we can't free the pool, because we
> don't know who else is using it.  This means that while all of the memory
> has been flushed to disk, we can't use any of the now available memory (a
> previous layer could have gotten there before us).

I have provided other solutions to the Content-Length problem. The above are
poor, as you point out. I don't believe that is true for my solution (buffer
into BUFF, or spill when it becomes larger).

> I really hope that outlines and explains my issues with this patch.

I am going to post a new note that summarizes the points you raised, and a
short summary of my responses. The net effect is that you haven't made any
points that would obviate applying the patch today. Any issues that you have
can be built upon this patch after it has been checked in.

>...
> BTW, I tend to believe (although I have no hard proof, it is a
> very strong feeling) that sub-requests won't and can't work with any of
> the current models.  Any sub-request that generates data (all of them),
> needs to pass that data back to the original request to go through the
> rest of the filters.  This finally struck me this morning.  I'm trying to
> figure out how to express why this is so, but the words aren't coming
> right now.  I'll try again later.

I have empirical proof that subrequests work in my framework.

If you would prefer that the subrequest goes through the original request's
filter chain, then simply comment out line 198 in my mod_gargle example.

I see no issues with subrequests from a *technical* standpoint. There is
certainly a policy standpoint about which filters their content should flow
through. But I've already shown that policy can be set at will.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: what are the issues? (was: Re: Patch review: ...)

Posted by rb...@covalent.net.
> > I am beginning to think though
> > that we aren't making much more progress.  I need to step back from this
> > for a while.  I am -1 for putting ANY patch in at all right now.
> 
> I can wait, but I don't have a clear understanding on your issues with my
> patch. I also don't want to see us try to solve ALL the filtering problems
> in one fell swoop. We should take a step at a time.

The issues are intertwined.  You can't just lop off part of the problem
and try to solve it that way.  We need a design that incorporates
everything.  Part of what scares me with your design is that you do keep
leaving the small stuff off.  This makes the patch incomplete.

> > also beginning to think that we may need to re-implement some pool
> > functions in order to do this at all properly.
> 
> *blink*  Eh? I don't see this, but I'm sure you will demonstrate :-)
> 
> I'll also say that I believe my patches empirically show that pool changes
> aren't needed to move us that first step.

It's not the first step that's important here.  If you take the first
step, and tie ourselves to pools, then the second step is not possible.

One of the things you've consistently said about my patch, is that
everything is allocated on the heap, and that means things stick around
until the end of the request.  That's true.  The fact is though, that the
same is true of your patch, once we create complicated filters.

Yes, this behavior is not seen in mod_gargle or any of the examples we
have coded so far.  It is there in some of the examples that we have
ignored (both of us).  For example, take real-time color quantization
(Kevin Kiley's example), which is then gzip compressed.

In the link-based scheme, because lower filters have no way of knowing
where the data they are using comes from, they have to memcopy
it.  Without changes to the pool structure (or ignoring pools), this
requires allocating a buffer, and just sticking all the data in the buffer
until you have soaked up all the data.  Then, the data can be passed to
the gzip filter, which also has to memcopy it (could be an mmap'ed
file) off to the side.

Now, you have two full copies of the data sitting on the heap, neither of
which will be freed until the request is done.  And, there is no way to
free them, because the pool used is the same as the pool the request_rec
comes from.

Now, make the image that we are doing this to REALLY big, like 30 
MB.  Now, you have the original 30 MB that was allocated (most likely
mmap'ed), plus the 30MB that was allocated for the quantifying layer, plus
the 30 MB that was allocated for the gzip layer.

What we need is exactly what Roy and Dean have been proposing, which is
buckets.  Not the buckets you have implemented or the buckets I have
implemented, because they don't go far enough.

> > I am -1 for using Greg's patch for all the reasons I dislike my patch.
> 
> I have three review points from you so far, each of which is solvable. I
> would like to hear from you whether these still seem unsolved, or whether you
> see other problems:

> 2) Should not use char* callbacks.
> 
>    SOLUTION: I disagree this is a problem. It is a stylistic issue, but has
>              no technical problems that are not present with buckets or
>              ioblock-based callbacks.

It does have drawbacks and problems that aren't there in the other
cases.  The other cases allow for single-copy writes.  This
doesn't.  Because the other design allows people to associate everything
about the bucket with the bucket.

The problem is that we are hiding what we are doing from the programmer,
so they can't make good decisions.  By telling the programmer to operate on
this string, and I'll give you more later, you are crippling what they
can do.  (example in next section using mod_autoindex.)

>    RATIONALE: If a filter callback is passed a bucket/ioblock and it needs a
> 	      char* to examine the actual content, then it must call a
> 	      utility function right away to get that char*:
> 	      
> 	      my_callback(bucket):
> 	          str = give_me_string(bucket)
> 		  process_string(str)
> 
>               The simple callback form alters this to:
> 	      
> 	      str = give_me_string(bucket)
> 	      my_callback(str):
> 	          process_string(str)

But, the design encourages people to not do this well.  Let me give you an
example.  Mod_autoindex.c:

Currently, mod_autoindex calls ap_rputs with a bunch of very small
strings.  If we pass down strings, then we are essentially telling module
writers that this is a valid way to write data.  This will increase our
memory footprint.  For example:   mod_autoindex  ->  mod_digest.

Mod_autoindex will write little strings a bunch of times, and mod_digest
will cache them all (on the heap, from the pool).  When mod_digest has
seen all of the data, it will calculate the md5sum, and then pass it on to
the next layer.

A much better way to write this is to have mod_autoindex allocate a big
block of data (on the heap, not using the pool, using malloc), which
stores all of the data that mod_autoindex can grab.  This requires that
mod_autoindex is re-written to do some of its own buffering.  Then, that
data is passed as one big chunk to mod_digest, which doesn't need to copy
it at all.  Mod_digest knows the data is modifiable, because the bucket
tells it so.  Mod_digest also knows that if it gets too much data, it can
write the data out to the disk, and just start freeing memory, because the
memory is associated with the bucket, not the pool.  Later, when
mod_digest has received all the data, it knows if anything has been
written to disk, and can calculate the md5sum.

Also, having both buckets and char *'s opens us up to bugs.  One of the
incredibly nice things about the buckets, is that they allow programmers
to group the data.  This means having lists of buckets.  In the code you
have posted, if I create a list of buckets, and then I try to pass that to
a char * filter, I will only get the first bucket.  The rest are
forgotten.  (I noticed this yesterday, but I was busy hacking my patch,
and then I was honestly pissed off, and I forgot to mention it).  The
ability to string buckets together is where this feature really pays off.

>               By doing the "give me a string" in the framework, we have more
> 	      opportunity to optimize the string's construction and
> 	      handling. For example, create and destroy a pool to hold the
> 	      string while the callback is invoked. In the "build the string
> 	      inside" approach, the callback would look like:

If we associate the memory with a pool, we not only have to keep track of
that pool, but also the pool's parent pool, and its parent's parent pool,
etc.  Once one pool is destroyed, all of the child pools are
destroyed.  This means that filters will have to either have A LOT of
information about the lifetime of pools, or all of the data lives as long
as the request.

> 	      my_callback(bucket):
> 	          pool = create_pool(bucket->r->pool)
> 		  str = give_me_string(bucket)
> 		  process_string(str)
> 		  destroy_pool(pool)

I thought of this, and it is what I was getting at when I was talking
about sibling pools, because it allows two filters to operate on two
different pools.  It also requires that each filter copy all of the data
into its own pool before it can do anything.  Otherwise, it doesn't know
what the previous layers will do with their pools.

Look at it this way:

     my_callback(bucket)
          pool = create_pool(bucket->r->pool);
          str = give_me_string(bucket)
          process_string(str)
                 [which calls the next layer, which doesn't have enough
                  data, and it moves the string off to the side.]
          destroy_pool(pool)
                 [The string is gone now, so the next layer had better
                  have made a copy.  If each layer does this, we end up
                  with a lot more data on the heap than we want. ]

>               For the simple callback, the pool handling occurs *outside*
> 	      the callback. It remains just as simple as above. The point
> 	      here is that we have optimization choices that will not affect
> 	      the simple_cb style of filter.

The point is that this looks like an optimization, but it is really not
one.  In reality, this design (as well as any other that requires
pools) requires more memory and pool memcpy's than are necessary.

>    SOLUTION: Per a previous email, I added a condition to the scan loop in
> 	     ap_add_filter(). I had guessed that would eventually be added,
> 	     but wanted to defer it until it was found to be really
> 	     necessary.

I need to look at that still.

> Other problems?

The content-length thing is a real issue.  You can say the spec allows us
to drop the content-length, but two people have said that isn't an
option.  Kevin Kiley and Roy.  I'll be number three.  If we are not just
going to drop the content-length, then we need to allow filters to keep
the whole request around (but not in memory).  This means being able to
flush the data to disk, and free up the memory.  With all of the current
patches, we are using pools.  This means that we have one of two options:

1)  Each layer creates its own sub-pool.  In this case, dumping to the
disk doesn't really do us any good, because all of the previous layers
have copies of the data lying around.

2)  We use one pool.  In this case, we can't free the pool, because we
don't know who else is using it.  This means that while all of the memory
has been flushed to disk, we can't use any of the now available memory (a
previous layer could have gotten there before us).

I really hope that outlines and explains my issues with this patch.  I
couldn't sleep last night so I started fiddling with some of Dean's ap_buf
stuff.  I may have a brand new patch in the next few days, I may
not.  Please do not wait for me.  I will review anything that comes along,
but I am more and more convinced that we need to re-vamp our memory model
to make filters work.  In the meantime, I have some configuration issues
that have been bugging me.  :-)

BTW, I tend to believe (although I have no hard proof, it is a
very strong feeling) that sub-requests won't and can't work with any of
the current models.  Any sub-request that generates data (all of them),
needs to pass that data back to the original request to go through the
rest of the filters.  This finally struck me this morning.  I'm trying to
figure out how to express why this is so, but the words aren't coming
right now.  I'll try again later.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------