Posted to apreq-dev@httpd.apache.org by Joe Schaefer <jo...@sunstarsys.com> on 2003/01/03 04:42:30 UTC

[rfc] some thoughts for apreq-2

After experimenting with a few different designs for apreq-2,
I've settled on the following concepts for our new data
structures:

  1) apreq_tables will be the common data structure
     for both req->param and apreq_cookie_jar.  apreq_table_get()
     will still return a char*, but it's an "enhanced" char*...

  2) the apreq_value_t is the (extensible) internal struct for the 
     table values:
  
     struct apreq_value_t {
        apr_ssize_t  size;
        char         data[1];  /* see comp.lang.c FAQ #2.6 */
     };

     apreq_table_get() returns a pointer to the "data" string;
     the size attribute can be recovered from that pointer by
     stepping back offsetof(apreq_value_t, data) bytes.

  3) parsers will extend apreq_value_t by placing it at the
     *end* of their own structs, since the tail of apreq_value_t
     is reserved for the data itself.

Although these choices may seem strange at first, I think they 
offer a good compromise between backwards compatibility, speed,
and a better string type for applications that need it.

Originally I was considering using apreq_value_t * throughout
the apreq_table API, but it made the API too cumbersome for
*typical* usage (I felt there was too much pointer checking & 
dereferencing).

Here's how I'm planning to use it for apreq_cookies:

  struct apreq_cookie_t {
      char *name;
      char *value;  /* typically points to v.data (see below) */
      char *path;
      char *domain;

      int   max_age;
      char *comment;
      char *commentURL;
      char *port;
      char  secure;

      apreq_cookie_version_t version; /* XXX does this belong here? */

      char *pool;
      apreq_cookie_encode_t *encode;
      apreq_cookie_decode_t *decode;
      void *cache;               /* decoded value (cached) */
      unsigned char sync;        /* coordinate state of value and cache */
      apreq_value_t v;           /* original "raw" value */
  };

We parse the incoming "Cookie" header (via apreq_cookie_parse) into a 
cookie_jar, which is just an apreq_table with enhanced values. This 
makes the API for fetching raw cookie values extremely simple:

  apreq_cookie_jar_t *jar = apreq_cookie_parse(r, NULL);
  const char *my_cval = apreq_table_get( jar, "mycookie" );
  /* decode my_cval... */

Since we only need to parse the "Cookie" header once per
request, I think we should store a pointer to the jar in
r->notes, presumably using "apreq_cookie_jar" as key.  

I also plan to use r->notes for storing a pointer to our
apreq_filter, unless someone has a better idea.

Comments?
-- 
Joe Schaefer

Re: [rfc] some thoughts for apreq-2

Posted by Eli Marmor <ma...@netmask.it>.

Joe Schaefer wrote:

> [cc'd discussion to apreq-dev list]

It may be useful to quote the rest of my e-mail, although it quotes a
somewhat old message:
-----------------------------------------------------------------------
Hi,

I forward an e-mail discussion between Stas and me.

It contains a proposal (the lines with the 1-level ">"), some parts of
which are not new and were discussed in the past.

I asked Stas' permission to forward it to you.

The purposes of the proposal:

1. Allow libapreq to serve not only Apache modules, but also CGI's and
   FastCGI's (the entire proposal refers only to apreq-2, which is
   based on APR (or - to be exact - supposed to be based on APR...)).

2. Allow libapreq to be both: a consumer of parameters (and/or cookies)
   AND a filter of them (not only for mod_proxy, but also to manipulate
   values on their way to an internal module).

The work which has been done so far on the porting to 2.0 shortens our
way dramatically.

A possibility that is not mentioned in the proposal, is to split the 2
layers to 2 different locations:

1. The low layer, which is dependent on APR but not on Apache, will be
   inserted into APR.
2. The high layer, which depends on Apache (brigades etc.), will be
   placed in a new directory under the "modules" directory (or even
   under the "server" directory).

The original discussion is attached (forwarded).
I am not using an open forum (like the mailing list), to avoid too much
noise.


Date: Wed, 11 Dec 2002 10:17:47 +0800
From: Stas Bekman <st...@stason.org>
To: Eli Marmor <ma...@netmask.it>
Subject: Re: Opinion about libapreq

Eli Marmor wrote:
> It's funny;
> After I sent you the question, and before receiving your response, I
> started to investigate apreq, concluding with what you answered, plus
> many other conclusions and ideas. Then, I looked at the archives of the
> mailing list, and was surprised to see that most of the things that I
> thought or found, were already discussed.
> 
> In general, the ideal could be if apreq-2 would be separated into two
> layers:
> 
> 1. The lower one: independent of Apache, but dependent on APR (it's a
>    piece of cake to port all the "ap_" to "apr_"). In this layer, there

The reason that ap_ is used instead of apr_ is that there is no apr_
equivalent.

>    will be no manipulation or access to request_rec or input_filters,
>    but simple parameters like a string with the params (in case of a
>    GET - the URL content after the "?", and in the case of POST - the
>    following input, including multipart), APR table of the headers, etc
> 
> 2. The higher layer, on top of the lower: with a similar functionality
>    of the current libapreq, plus the following: 2 modes of "read" (one
>    of them "consumes" the parameters - like CGI - while the other one
>    passes them to the next consumer). Possible "set"/"add"/"unset" of
>    params, so the next consumer (or backend in the case of a proxy or
>    reverse proxy) will be fed different parameters than the originals
>    that were typed in by the browser's user.
> 
> If this is accomplished, it is easy to write a simple and thin layer
> library for CGI's, that will check the method (GET/POST), take the
> QUERY_STRING or the input data (according to the method), build an APR
> table of the headers, and call the (above) low layer. Of course, it
> will force the users of this library to link APR to their CGI (and of
> course libapreq).

Joe has already started working on hookable parsers for example. So you
can install your own parser to massage the data. I think he did it to
handle xml.

> It will also be easy to write a simple and thin layer library for
> FastCGI.
> 
> And a special mode of those layers (CGI/FastCGI) will run the CGI's/
> FastCGI's directly, without passing through any browser/Apache/server,
> allowing higher levels of debugging, profiling and tuning (input
> parameters will be fed using the command line flags or the standard
> input).
> 
> All of these "TODO's" are not a dream; actually, I already did most of
> them for CGIC, but I think that libapreq has a more promising future,
> so I'd love to see them here. I can help, but this effort is more than
> what I can do alone, especially when there are people who are more
> expert than me in APR, filters, and libapreq internals.
> 
> If you are interested, I'd love to discuss these issues more
> interactively (chat?  phone?). Maybe even in Hebrew ;-)
> (unless you forgot your Hebrew...  ;-)

Sure, we can discuss this interactively. But I'd rather see this
discussion on the apreq list. Especially since Joe had quite a few
similar ideas. I guess I'd really want to have at least Joe being
involved in this discussion, but I'm sure that others would like to say
a word too.

So we need to discuss what we want, lay out a roadmap, write a spec and
start implementing things.

>>>Just one last question before digging into apreq-2:
>>>
>>>Can it work also standalone (linked with APR, of course)?
>>>Or just as a part of Apache2?
>>
>>It can't, because it uses ap_ calls. Though if there is an interest it
>>should be possible to abstract certain calls so that it doesn't require
>>Apache anymore.
>>
>>It's tied to Apache for 3 reasons:
>>
>>* Certain functions should live in apr_ but they are ap_
>>* Reading from the client is currently implemented via
>>  ap_get_client_block, but should be easy to abstract away
>>* logging is done via ap_log_rerror, but again can be easily abstracted
>>  away.
>>
>>So in my opinion it should be quite easy to make the lib independent of
>>Apache.
>>
>>
>>>In case it can, do you have any idea if this is going to change?
>>>(for example, Joe might port it to filtered-IO, without keeping
>>>backward compatibility, in a way that will prevent it from being run
>>>standalone).
>>
>>All these features can be optional. The library should be still usable
>>as it is now (plus as you suggest to be apr_ only dependent, if you give
>>a hand to accomplish that). The filtering IO is really just the way to
>>feed the parser, there could be several feeding services available.


> Eli Marmor <ma...@netmask.it> writes:
> 
> > The main idea is to use the filtering mechanism to its limits, so apreq
> > can:
> >
> > 1. be used by modules from most of the layers (except for, obviously,
> >    the CONNECTION layer - i.e. SSL and gzip...)
> 
> Right- the idea is to make the parsed data available to the
> request (input and output) filters, as well as the content
> handler itself.
> 
> > 3. be separated into two layers, one of them independent of Apache (but
> >    only depends on APR). It allows external programs (such as CGI's and
> >    FastCGI) to use it.
> 
> I think it's possible to embed a home-brewed "request_rec" into an
> application, that basically tricks libapreq into thinking it is running
> inside httpd.  With apreq-2, there'd be a bit more work to do that since
> httpd's filter API would need emulation also.  It's not something that
> interests me personally, but if you'd like to see this happen, I'm
> not opposed to it either.
> 
> Some portions of apreq-2 could go into APR, but apreq is very much
> tied to HTTP.  After looking over my current code, I don't expect
> the "generically useful for APR*" portion to amount to very much.
> 
> TENTATIVE ROADMAP for apreq-2:
> 
> Barring unforeseen events :-), I'd like to put up an apreq-2
> repository on cvs.apache.org in the next few days.  A functional
> filter implementation isn't far off now; IOW I expect the C API to
> be reasonably polished by the end of January.
> 
> Assuming we're all basically satisfied with it by then, we can lock
> down the core API at the end of this month, and let the application
> programmers (especially the language-gluers) do their thing with
> apreq-2.

--
Eli Marmor
marmor@netmask.it
CTO, Founder
Netmask (El-Mar) Internet Technologies Ltd.
__________________________________________________________
Tel.:   +972-9-766-1020          8 Yad-Harutzim St.
Fax.:   +972-9-766-1314          P.O.B. 7004
Mobile: +972-50-23-7338          Kfar-Saba 44641, Israel

Re: [rfc] some thoughts for apreq-2

Posted by Joe Schaefer <jo...@sunstarsys.com>.

[cc'd discussion to apreq-dev list]

Eli Marmor <ma...@netmask.it> writes:

> The main idea is to use the filtering mechanism to its limits, so apreq
> can:
> 
> 1. be used by modules from most of the layers (except for, obviously,
>    the CONNECTION layer - i.e. SSL and gzip...)

Right- the idea is to make the parsed data available to the
request (input and output) filters, as well as the content 
handler itself.

> 2. change values and not only read them (maybe the following is
>    possible: when an input filter is pushing the bucket-brigades to the
>    following filter, after it modified the apreq tables, the POST data
>    (or the GET after the "?") will be rewritten to reflect the new
>    values.

You can do that already, but it's

  1) cumbersome, since apache tables aren't designed for fast deletes.

  2) IMO not a good idea, since it amounts to lying to *other* apps 
     about the actual client data.  In very rare cases (e.g. a broken
     browser) it might be necessary to do that, but I can't think of
     a good reason why apreq users should *want* to modify table entries.
     The same thing goes for incoming cookies.

> 3. be separated into two layers, one of them independent of Apache (but
>    only depends on APR). It allows external programs (such as CGI's and
>    FastCGI) to use it. 

I think it's possible to embed a home-brewed "request_rec" into an 
application, that basically tricks libapreq into thinking it is running 
inside httpd.  With apreq-2, there'd be a bit more work to do that since 
httpd's filter API would need emulation also.  It's not something that
interests me personally, but if you'd like to see this happen, I'm
not opposed to it either.

Some portions of apreq-2 could go into APR, but apreq is very much
tied to HTTP.  After looking over my current code, I don't expect 
the "generically useful for APR*" portion to amount to very much.

TENTATIVE ROADMAP for apreq-2:

Barring unforeseen events :-), I'd like to put up an apreq-2 
repository on cvs.apache.org in the next few days.  A functional 
filter implementation isn't far off now; IOW I expect the C API to 
be reasonably polished by the end of January.  

Assuming we're all basically satisfied with it by then, we can lock 
down the core API at the end of this month, and let the application 
programmers (especially the language-gluers) do their thing with 
apreq-2.

-- 
Joe Schaefer