Posted to dev@httpd.apache.org by Greg Stein <gs...@lyra.org> on 2000/06/30 09:42:01 UTC

char* handler (was: what are the issues?)

On Thu, Jun 29, 2000 at 11:30:51PM -0700, Roy T. Fielding wrote:
> >Point is: the char* callback does exactly what an ioblock/bucket callback
> >would do on its own when it must examine each byte.
> >
> >So, I will state again: the char* callback is not a problem. If you
> >disagree, then please explain further.
> 
> There is a significant flaw in that argument.  char * doesn't do what we
> want when a filter does not have to examine each byte.  That is the problem.
> 
> It doesn't make any sense to have two filter interfaces when you can
> accomplish the same with one and a simple parameter conversion function.

The char* handler is not suited for all filters. Granted. I've maintained
that it is simply a convenience for those filters that don't want to munge
through the bucket interface.

Consider the case where you need to crawl through all the bytes of the
content delivered into your filter (gzip, recoding, SSI, PHP, etc). Now
let's take the bucket-based interface from my patch:

  my_callback(filter, bucket)
  {
      ... how to process each character in the bucket? ...
  }

The processing gets a bit hairy, and that hairiness would be duplicated in
every each-char-walking filter. I believe Jim Winstead suggested something
like:

    p = bucket->get_data()

Unfortunately, that isn't quite enough. If the bucket represents a file,
then you can't simply get a pointer to it (don't want to read it all into
memory, and maybe mmap is not present on the platform). This implies that
you will have a read-loop in your callback:

    read_context = prep_read_context(bucket);
    while (1) {
        p = bucket->get_more_data(read_context);
        if (p == NULL)          /* no more data in this bucket */
            break;

        ... process p ...
    }

Now, we can't just keep reading chunks of data from that file endlessly
into the heap, so we need some kind of buffer management:

    if (ctx->read_buf == NULL)
        ctx->read_buf = ap_palloc(some_pool, READ_BUF_SIZE);
    read_context = prep_read_context(bucket, ctx->read_buf);
    ...

All right... maybe that is "okay" for people to do in each filter, and it
works for the file case.
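
Pulling those pieces together, every each-char filter ends up carrying
something like the following. This is purely a sketch: the bucket type and
the prep_read_context()/get_more_data() calls are the hypothetical
interface from the snippets above, not anything in the current patch.

    struct my_ctx {
        char *read_buf;
    };

    static void my_callback(void *filter_ctx, bucket *b)
    {
        struct my_ctx *ctx = filter_ctx;
        void *read_context;
        const char *p;

        /* lazily set up the filter's private read buffer */
        if (ctx->read_buf == NULL)
            ctx->read_buf = ap_palloc(some_pool, READ_BUF_SIZE);

        read_context = prep_read_context(b, ctx->read_buf);

        /* pull the bucket through the buffer, chunk by chunk */
        while ((p = b->get_more_data(read_context)) != NULL) {
            /* ... process each char of p ... */
        }
    }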

Uh oh... what happens when somebody calls ap_rprintf()? If we attempt to
delay the actual formatting as late as possible (so that BUFF can do it
directly into the network buffer), then we pass around a fmt/va_list pair.
Our issue here is to reduce that to characters for scanning. We can use
ap_vsnprintf() to drop it into ctx->read_buf, but what happens on overflow?
Now we need to allocate a big enough block from somewhere, format into it
using ap_pvsprintf(), and then toss it out. The only "toss" that we have
right now is destroying a pool.
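
Concretely, the dance looks something like this. A sketch only: I'm
assuming the 1.3-style pool calls (ap_make_sub_pool/ap_destroy_pool), that
ap_vsnprintf() signals truncation by filling the buffer, and scan_bytes()
is just a stand-in for the filter's per-char work:

    len = ap_vsnprintf(ctx->read_buf, READ_BUF_SIZE, fmt, va);
    if (len < READ_BUF_SIZE - 1) {
        scan_bytes(ctx, ctx->read_buf, len);    /* it fit; easy case */
    }
    else {
        /* (possible) overflow: format into a throwaway sub-pool, since
           destroying a pool is the only "toss" we have.  (Reusing the
           va_list after the failed attempt is yet another headache.) */
        pool *tmp = ap_make_sub_pool(some_pool);
        char *p = ap_pvsprintf(tmp, fmt, va);

        scan_bytes(ctx, p, strlen(p));
        ap_destroy_pool(tmp);
    }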

My intent was to simplify the job for filters. When they don't want to do
this work, they use a char* handler and get plain old bytes. Simple, clean,
and easy.
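
For comparison, the char* flavor is about as simple as it gets (the exact
signature below is illustrative, not quoted from the patch):

    static void my_char_callback(void *filter, const char *buf, int len)
    {
        int i;

        for (i = 0; i < len; ++i) {
            /* ... examine buf[i]: gzip, recode, scan for SSI tags ... */
        }
    }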

When filters can be smarter and work with files or bytes, or whatever, then
they can use the bucket-based callback. Should we encourage everybody to use
the bucket interface? Probably. Does this encouragement change the patch
that is submitted? I don't think so.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/