Posted to apreq-dev@httpd.apache.org by Joe Schaefer <jo...@sunstarsys.com> on 2002/01/31 04:34:47 UTC

proposed 2.0 features/layout (was Re: starting apreq 2.0)

Here's a list of proposed changes to the C API; please comment

1) use "apreq_" as the generic prefix for files and functions
in the C API.

e.g. structs like ApacheCookie and ApacheRequest replaced
by apreq_cookie_t and apreq_request_t; function names 
and macros map similarly

  ApacheRequest_new      -> apreq_request_new
  ApacheCookieJarFetch   -> apreq_cookie_jar_fetch

The current header files are inconsistent and have
too much cruft.  Although I prefer using a common
naming convention for both macros and functions within
header files, I could live with adopting a different
scheme for macros.  But the current .h files are not
consistent in this regard, and IMO that needs to change.

2) (user-extensible) enctypes for POST data
  a) www-urlencoded
  b) multipart/form-data
  c) xml types?

Currently the logic for parsing form-data is tangled 
between apache_request.c and apache_multipart_buffer.c;
I'd like to see that replaced by a generic mechanism that
extracts all the particulars for parsing a given enctype
*out* of apache_request.c.  IOW an end-user should be able to 
"register" an enctype with apreq and provide hooks to 
handle the data parsing.  Of course we should provide support 
for the most commonly used ones, but I think we should use the 
same interface as an end user would for this.

3) persistence across various apache phases?

Just a thought- we don't offer this with apreq 1.0, but we 
could offer some internal support for an instance() in 2.0.

4) random goals for improvement:

1) better rfc compliance
2) user-tunable resource management (RAM, temp files, etc.)
3) handle chunked data gracefully

Any other cool/useful features we should consider?

-- 
Joe Schaefer

Re: proposed 2.0 features/layout (was Re: starting apreq 2.0)

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Issac Goldstand <ma...@beamartyr.net> writes:

> So "prebundled" parse methods (multipart/form-data and 
> application/x-www-form-urlencoded) will be included in bundled .c files
> and called when libapreq is started, or what?

More or less- the prebundled parsers should get loaded into
the apreq struct's dispatch table as part of its initialization- 
that job belongs to apreq_request_new().  A user (by that
I mean C module programmer), could add more, override those,
or just leave well-enough alone.

Besides being more flexible than the current stuff, it should
make debugging parser errors a whole lot easier.  For instance,
we actually parse multipart/form-data using LF as end-of-line 
marker; that's due to the various bugs with IE 3 & 5 on the Mac. 
If some future vendor screws up a future format, a user (or us)
can easily drop in a parser replacement, and we/they can even
arrange to have it done on a per-request (IOW per-browser) basis.
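
As an illustration of the LF-as-end-of-line point, here is a minimal sketch of a tolerant line splitter. It is not taken from apreq; next_line and print_lines are hypothetical names, and the only assumption is that a line ends at LF with an optional CR in front of it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of apreq: locate the next line in
 * buf[0..len).  A line ends at LF; a CR directly before the LF is
 * dropped, so CRLF and bare-LF clients are both accepted.
 * Returns the number of bytes consumed (terminator included), or 0 if
 * no complete line is buffered yet.  *line_len gets the length of the
 * line without its terminator. */
size_t next_line(const char *buf, size_t len, size_t *line_len)
{
    const char *lf = memchr(buf, '\n', len);
    size_t n;

    if (lf == NULL)
        return 0;
    n = (size_t)(lf - buf);
    *line_len = (n > 0 && buf[n - 1] == '\r') ? n - 1 : n;
    return n + 1;
}

/* Print every complete line in the buffer, one per output line. */
void print_lines(const char *buf, size_t len)
{
    size_t off = 0, used, line_len;

    while ((used = next_line(buf + off, len - off, &line_len)) != 0) {
        printf("[%.*s]\n", (int)line_len, buf + off);
        off += used;
    }
}
```

A replacement parser for a misbehaving client could swap in a different splitter like this one without touching the rest of the request code.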

-- 
Joe Schaefer

Re: proposed 2.0 features/layout (was Re: starting apreq 2.0)

Posted by Issac Goldstand <ma...@beamartyr.net>.

Joe Schaefer wrote:

>Issac Goldstand <ma...@beamartyr.net> writes:
>
>>Maurice Aubrey wrote:
>>
>>>I think the idea is to just have a function, something
>>>like apreq_register_enctype(), that accepts an enctype and
>>>a function pointer.  And when a request is ready to be
>>>parsed it will look up the enctype in that table and call
>>>the appropriate function(s).
>>>
>
>That's exactly what I had in mind.
>
>>So it would be registered at apache startup using a callback in some 
>>other apache module?  
>>
>
>No.
>
>>If not, what's going to register the function?  
>>
>
>Presumably, the user writes something like this
>
>  apreq_request_t *apr = apreq_request_new(r, with_some_flags);
>
>  apreq_request_parser_t *foo = /* some appropriate function pointer */;
>
>  apreq_register_enctype(apr, "application/sgml-form-urlencoded", foo,
>                                (void *)parser_data );
>
>  apreq_request_parse(apr); /* this calls foo(apr, parser_data) to parse
>                                sgml-encoded input */
>
>>If so, does that mean that an apache module has to be created for each
>>enctype (or "bundle" of enctypes)?
>>
>
>No, after settling on the names and features, we just stick
>apreq_register_enctype and apreq_request_parser_t in a .h file 
>and implement them.  We keep a dispatch table of enctypes somewhere 
>in the apreq_request_t struct (installing the default enctypes
>via the apreq_request_new() call).
>
>It's conceptually similar to how we implemented the UPLOAD_HOOK
>flag in Apache::Request.  The logic for it is mostly carried out in 
>apache_request.c, and it was David Welton's idea to do it that way.  
>All I'm really suggesting is that we run with it a little further 
>and push the whole parsing activity into an external function.
>
So "prebundled" parse methods (multipart/form-data and 
application/x-www-form-urlencoded) will be included in bundled .c files 
and called when libapreq is started, or what?

  Issac

Re: proposed 2.0 features/layout (was Re: starting apreq 2.0)

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Issac Goldstand <ma...@beamartyr.net> writes:

> Maurice Aubrey wrote:
> >
> >I think the idea is to just have a function, something
> >like apreq_register_enctype(), that accepts an enctype and
> >a function pointer.  And when a request is ready to be
> >parsed it will look up the enctype in that table and call
> >the appropriate function(s).

That's exactly what I had in mind.

> So it would be registered at apache startup using a callback in some 
> other apache module?  

No.

> If not, what's going to register the function?  

Presumably, the user writes something like this

  apreq_request_t *apr = apreq_request_new(r, with_some_flags);

  apreq_request_parser_t *foo = /* some appropriate function pointer */;

  apreq_register_enctype(apr, "application/sgml-form-urlencoded", foo,
                                (void *)parser_data );

  apreq_request_parse(apr); /* this calls foo(apr, parser_data) to parse
                                sgml-encoded input */

> If so, does that mean that an apache module has to be created for each
> enctype (or "bundle" of enctypes)?

No, after settling on the names and features, we just stick
apreq_register_enctype and apreq_request_parser_t in a .h file 
and implement them.  We keep a dispatch table of enctypes somewhere 
in the apreq_request_t struct (installing the default enctypes
via the apreq_request_new() call).

It's conceptually similar to how we implemented the UPLOAD_HOOK
flag in Apache::Request.  The logic for it is mostly carried out in 
apache_request.c, and it was David Welton's idea to do it that way.  
All I'm really suggesting is that we run with it a little further 
and push the whole parsing activity into an external function.
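
A rough sketch of what such a dispatch table might look like follows. Nothing here is settled API: the struct layout, the fixed-size table, the parser signature (returning 0 on success), and apreq_request_init (standing in for apreq_request_new(), which would take a request_rec) are all assumptions made for illustration. The default parsers are empty stubs.

```c
#include <stddef.h>
#include <string.h>

#define APREQ_MAX_ENCTYPES 8            /* arbitrary limit for this sketch */

typedef struct apreq_request_t apreq_request_t;

/* Assumed parser signature: returns 0 on success. */
typedef int apreq_request_parser_t(apreq_request_t *req, void *data);

struct apreq_enctype_entry {
    const char             *enctype;
    apreq_request_parser_t *parser;
    void                   *data;
};

struct apreq_request_t {
    const char *content_type;           /* stand-in for the request header */
    struct apreq_enctype_entry table[APREQ_MAX_ENCTYPES];
    size_t nentries;
};

/* Add a parser, or override the one already registered for enctype.
 * Returns 0 on success, -1 if the table is full. */
int apreq_register_enctype(apreq_request_t *req, const char *enctype,
                           apreq_request_parser_t *parser, void *data)
{
    size_t i;

    for (i = 0; i < req->nentries; i++)
        if (strcmp(req->table[i].enctype, enctype) == 0)
            break;
    if (i == req->nentries) {
        if (req->nentries == APREQ_MAX_ENCTYPES)
            return -1;
        req->nentries++;
    }
    req->table[i].enctype = enctype;
    req->table[i].parser  = parser;
    req->table[i].data    = data;
    return 0;
}

/* Stub standing in for the bundled parsers. */
int default_stub_parser(apreq_request_t *req, void *data)
{
    (void)req; (void)data;
    return 0;
}

/* Example user parser: counts its invocations through data. */
int example_counting_parser(apreq_request_t *req, void *data)
{
    (void)req;
    ++*(int *)data;
    return 0;
}

/* Hypothetical initializer: installs the default enctypes. */
void apreq_request_init(apreq_request_t *req, const char *content_type)
{
    req->content_type = content_type;
    req->nentries = 0;
    apreq_register_enctype(req, "application/x-www-form-urlencoded",
                           default_stub_parser, NULL);
    apreq_register_enctype(req, "multipart/form-data",
                           default_stub_parser, NULL);
}

/* Find the entry matching the request's enctype and call its parser.
 * Returns the parser's result, or -1 if nothing is registered. */
int apreq_request_parse(apreq_request_t *req)
{
    size_t i;

    for (i = 0; i < req->nentries; i++) {
        size_t n = strlen(req->table[i].enctype);
        /* match the type itself, ignoring parameters such as "; boundary=" */
        if (strncmp(req->content_type, req->table[i].enctype, n) == 0
            && (req->content_type[n] == '\0' || req->content_type[n] == ';'))
            return req->table[i].parser(req, req->table[i].data);
    }
    return -1;
}
```

Because the defaults go through the same apreq_register_enctype() call as user parsers, overriding one is just a second registration under the same enctype string.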

-- 
Joe Schaefer

Re: proposed 2.0 features/layout (was Re: starting apreq 2.0)

Posted by Issac Goldstand <ma...@beamartyr.net>.

Maurice Aubrey wrote:

>On Thu, Jan 31, 2002 at 11:28:31AM +0200, Issac Goldstand wrote:
>
>>Joe Schaefer wrote:
>>
>[snip]
>
>>>Currently the logic for parsing form-data is tangled 
>>>between apache_request.c and apache_multipart_buffer.c;
>>>I'd like to see that replaced by a generic mechanism that
>>>extracts all the particulars for parsing a given enctype
>>>*out* of apache_request.c.  IOW an end-user should be able to 
>>>"register" an enctype with apreq and provide hooks to 
>>>handle the data parsing.  Of course we should provide support 
>>>for the most commonly used ones, but I think we should use the 
>>>same interface as an end user would for this.
>>>
>>Well, I'm no expert on this, but that had overtones of shared libraries. 
>> I can think of how to do this in C++ (provide an abstract parent class, 
>>whose definition can be used to dynamically allocate a subtype of that 
>>class), and Perl's even easier - I just did that for myself, actually 
>>(eval "use" and eval "Apache::Request::Parse::$parsemethod") but I have 
>>no idea how you can do this in C...
>>
>
>I think the idea is to just have a function, something
>like apreq_register_enctype(), that accepts an enctype and
>a function pointer.  And when a request is ready to be
>parsed it will look up the enctype in that table and call
>the appropriate function(s).
>
>Maurice
>
So it would be registered at apache startup using a callback in some 
other apache module?  If not, what's going to register the function?  If 
so, does that mean that an apache module has to be created for each 
enctype (or "bundle" of enctypes)?

  Issac

Re: proposed 2.0 features/layout (was Re: starting apreq 2.0)

Posted by Maurice Aubrey <ma...@hevanet.com>.

On Thu, Jan 31, 2002 at 11:28:31AM +0200, Issac Goldstand wrote:
> Joe Schaefer wrote:
> 
[snip]
> >Currently the logic for parsing form-data is tangled 
> >between apache_request.c and apache_multipart_buffer.c;
> >I'd like to see that replaced by a generic mechanism that
> >extracts all the particulars for parsing a given enctype
> >*out* of apache_request.c.  IOW an end-user should be able to 
> >"register" an enctype with apreq and provide hooks to 
> >handle the data parsing.  Of course we should provide support 
> >for the most commonly used ones, but I think we should use the 
> >same interface as an end user would for this.
> >
> Well, I'm no expert on this, but that had overtones of shared libraries. 
>  I can think of how to do this in C++ (provide an abstract parent class, 
> whose definition can be used to dynamically allocate a subtype of that 
> class), and Perl's even easier - I just did that for myself, actually 
> (eval "use" and eval "Apache::Request::Parse::$parsemethod") but I have 
> no idea how you can do this in C...

I think the idea is to just have a function, something
like apreq_register_enctype(), that accepts an enctype and
a function pointer.  And when a request is ready to be
parsed it will look up the enctype in that table and call
the appropriate function(s).

Maurice

Re: proposed 2.0 features/layout (was Re: starting apreq 2.0)

Posted by Issac Goldstand <ma...@beamartyr.net>.

Joe Schaefer wrote:

>Here's a list of proposed changes to the C API; please comment
>
>1) use "apreq_" as the generic prefix for files and functions
>in the C API.
>
>e.g. structs like ApacheCookie and ApacheRequest replaced
>by apreq_cookie_t and apreq_request_t; function names 
>and macros map similarly
>
>  ApacheRequest_new      -> apreq_request_new
>  ApacheCookieJarFetch   -> apreq_cookie_jar_fetch
>
>The current header files are inconsistent and have
>too much cruft.  Although I prefer using a common
>naming convention for both macros and functions within
>header files, I could live with adopting a different
>scheme for macros.  But the current .h files are not
>consistent in this regard, and IMO that needs to change.
>
Sounds smart.

>
>2) (user-extensible) enctypes for POST data
>  a) www-urlencoded
>  b) multipart/form-data
>  c) xml types?
>
>Currently the logic for parsing form-data is tangled 
>between apache_request.c and apache_multipart_buffer.c;
>I'd like to see that replaced by a generic mechanism that
>extracts all the particulars for parsing a given enctype
>*out* of apache_request.c.  IOW an end-user should be able to 
>"register" an enctype with apreq and provide hooks to 
>handle the data parsing.  Of course we should provide support 
>for the most commonly used ones, but I think we should use the 
>same interface as an end user would for this.
>
Well, I'm no expert on this, but that had overtones of shared libraries. 
 I can think of how to do this in C++ (provide an abstract parent class, 
whose definition can be used to dynamically allocate a subtype of that 
class), and Perl's even easier - I just did that for myself, actually 
(eval "use" and eval "Apache::Request::Parse::$parsemethod") but I have 
no idea how you can do this in C...

>
>3) persistence across various apache phases?
>
>Just a thought- we don't offer this with apreq 1.0, but we 
>could offer some internal support for an instance() in 2.0.
>
Are you talking about something in addition to the existing Perl 
Apache::Request->instance?   If not, why not leave it as is?  If so, why 
not code something similar in C?

>
>4) random goals for improvement:
>
>1) better rfc compliance
>2) user-tunable resource management (RAM, temp files, etc.)
>3) handle chunked data gracefully
>
>Any other cool/useful features we should consider?
>

Not off the top of my head...  You might want to take the different MPMs 
into account and see if some things can be "enhanced" for different 
MPMs.  Same for core modules (eg, can apreq be helpful for an FTP core 
module? POP? IMAP?)

  Issac

Re: naming conventions & data structures (was Re: proposed 2.0 features/layout)

Posted by Issac Goldstand <ma...@beamartyr.net>.

Not sure if this is already part of the API, but with the new proposed 
method of registering POST handlers, is there some function that will 
tell Apache handlers what kind of content was sent to it?  I can't say 
that it would always be necessary - or even ever, perhaps - but it would 
leave a door open that could possibly become important... Once people 
can develop their own handlers for parsing the POSTs, you can never tell 
what sort of cool applications they may develop, but it's possible that 
the POST method might be a variable in some complex application...

Just my two cents...

  Issac


Joe Schaefer wrote:

>Stas Bekman <st...@stason.org> writes:
>
>>>The current header files are inconsistent and have
>>>too much cruft.  Although I prefer using a common
>>>naming convention for both macros and functions within
>>>header files, I could live with adopting a different
>>>scheme for macros.  But the current .h files are not
>>>consistent in this regard, and IMO that needs to change.
>>>
>>
>>+ for common naming convention.
>>
>
>OK- some clarification on this: macros that have function-call
>semantics (i.e. the args appear only once in the definition)
>should be lower-case;  macros that may not be safe, like say
>
>  #define apreq_MAX(a,b) ( (a) < (b) ? (b) : (a) )
>
>MUST have (some) upper-case letters.
>
>Another issue I think we should address is what sort of
>data structures we'll use to hold the params, cookies,
>and uploads.  In (the still-pending-Jim's release of) 1.0,
>we use apache arrays and tables, and a simple home-grown
>linked list for uploads.
>
>For 2.0, I'm leaning *against* using ap_hash because it doesn't
>seem to support multivalued keys as well as apache 1.x tables 
>do (tables are implemented as a special kind of apache array;
>the hash-like interface for an apache table is just a ruse.)
>
>IOW, if we decide on using a real hash, it probably should be 
>list-valued, and I think that means we'll need to roll-our-own
>version of ap_hash.  IMO it will be simpler to just use a linked 
>list (a'la ap_ring.h) as the underlying structure for everything, 
>and just superimpose a hash-like interface on top of it.
>
>Thoughts?
>

Re: naming conventions & data structures (was Re: proposed 2.0 features/layout)

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Stas Bekman <st...@stason.org> writes:

> Joe Schaefer wrote:
>
> > OK- some clarification on this: macros that have function-call
> > semantics (i.e. the args appear only once in the definition)
> > should be lower-case;  macros that may not be safe, like say
> > 
> >   #define apreq_MAX(a,b) ( (a) < (b) ? (b) : (a) )
> > 
> > MUST have (some) upper-case letters.
> 
> Why not have all macros upper-cased?
> 

You're right- I don't know why I even raised this issue.
Probably insufficient sleep/caffeine, or some combination
of the two.

-- 
Joe Schaefer

Re: naming conventions & data structures (was Re: proposed 2.0 features/layout)

Posted by Stas Bekman <st...@stason.org>.

Joe Schaefer wrote:
> Stas Bekman <st...@stason.org> writes:
> 
> 
>>>The current header files are inconsistent and have
>>>too much cruft.  Although I prefer using a common
>>>naming convention for both macros and functions within
>>>header files, I could live with adopting a different
>>>scheme for macros.  But the current .h files are not
>>>consistent in this regard, and IMO that needs to change.
>>>
>>
>>+ for common naming convention.
>>
> 
> OK- some clarification on this: macros that have function-call
> semantics (i.e. the args appear only once in the definition)
> should be lower-case;  macros that may not be safe, like say
> 
>   #define apreq_MAX(a,b) ( (a) < (b) ? (b) : (a) )
> 
> MUST have (some) upper-case letters.

Why not have all macros upper-cased?

> Another issue I think we should address is what sort of
> data structures we'll use to hold the params, cookies,
> and uploads.  In (the still-pending-Jim's release of) 1.0,
> we use apache arrays and tables, and a simple home-grown
> linked list for uploads.
> 
> For 2.0, I'm leaning *against* using ap_hash because it doesn't
> seem to support multivalued keys as well as apache 1.x tables 
> do (tables are implemented as a special kind of apache array;
> the hash-like interface for an apache table is just a ruse.)
> 
> IOW, if we decide on using a real hash, it probably should be 
> list-valued, and I think that means we'll need to roll-our-own
> version of ap_hash.  IMO it will be simpler to just use a linked 
> list (a'la ap_ring.h) as the underlying structure for everything, 
> and just superimpose a hash-like interface on top of it.
> 
> Thoughts?

If we only could throw a linked list into apr_hash entry, but it works 
only with strings :(

Maybe we need to raise this issue at apr-dev? I'd prefer to have an
apr-based solution rather than rolling our own. I can see other projects
needing this functionality.

IMO, the fastest (coding wise) solution would be to copy apr_hash and 
let it accept a struct instead of a string as a value. Then we can 
always link from the head struct further to get multi-valued hash.

_____________________________________________________________________
Stas Bekman             JAm_pH      --   Just Another mod_perl Hacker
http://stason.org/      mod_perl Guide   http://perl.apache.org/guide
mailto:stas@stason.org  http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/

naming conventions & data structures (was Re: proposed 2.0 features/layout)

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Stas Bekman <st...@stason.org> writes:

> > The current header files are inconsistent and have
> > too much cruft.  Although I prefer using a common
> > naming convention for both macros and functions within
> > header files, I could live with adopting a different
> > scheme for macros.  But the current .h files are not
> > consistent in this regard, and IMO that needs to change.
> 
> 
> + for common naming convention.

OK- some clarification on this: macros that have function-call
semantics (i.e. the args appear only once in the definition)
should be lower-case;  macros that may not be safe, like say

  #define apreq_MAX(a,b) ( (a) < (b) ? (b) : (a) )

MUST have (some) upper-case letters.
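
To make the "may not be safe" part concrete, here is a small sketch showing the double evaluation. Only apreq_MAX comes from this mail; apreq_max_int, next_value, and the demo functions are hypothetical names used for the illustration.

```c
#define apreq_MAX(a,b) ( (a) < (b) ? (b) : (a) )

/* Function equivalent: each argument is evaluated exactly once. */
int apreq_max_int(int a, int b)
{
    return a < b ? b : a;
}

int apreq_demo_calls = 0;

/* An argument expression with a side effect. */
int next_value(void)
{
    return ++apreq_demo_calls;
}

/* Returns how many times next_value() ran when passed to the macro.
 * The macro names (a) twice: once in the comparison and once more
 * when it turns out to be the larger operand. */
int demo_macro(void)
{
    apreq_demo_calls = 0;
    (void)apreq_MAX(next_value(), 0);
    return apreq_demo_calls;
}

/* Same call through the function: the argument is evaluated once. */
int demo_function(void)
{
    apreq_demo_calls = 0;
    (void)apreq_max_int(next_value(), 0);
    return apreq_demo_calls;
}
```

Here demo_macro() reports two calls and demo_function() reports one, which is the hazard the upper-case letters are meant to flag.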

Another issue I think we should address is what sort of
data structures we'll use to hold the params, cookies,
and uploads.  In (the still-pending-Jim's release of) 1.0,
we use apache arrays and tables, and a simple home-grown
linked list for uploads.

For 2.0, I'm leaning *against* using ap_hash because it doesn't
seem to support multivalued keys as well as apache 1.x tables 
do (tables are implemented as a special kind of apache array;
the hash-like interface for an apache table is just a ruse.)

IOW, if we decide on using a real hash, it probably should be 
list-valued, and I think that means we'll need to roll-our-own
version of ap_hash.  IMO it will be simpler to just use a linked 
list (a'la ap_ring.h) as the underlying structure for everything, 
and just superimpose a hash-like interface on top of it.

Thoughts?
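
For discussion, a minimal sketch of the "linked list with a hash-like interface" idea. It uses a plain singly linked list rather than the ring macros, the caller supplies node storage (a real version would presumably allocate from a pool), and every name in it is hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef struct apreq_entry {
    const char         *key;
    const char         *val;
    struct apreq_entry *next;
} apreq_entry;

typedef struct {
    apreq_entry *head;
    apreq_entry *tail;
} apreq_list;

/* Append without replacing: a repeated key keeps all of its values,
 * in the order they arrived. */
void apreq_list_add(apreq_list *l, apreq_entry *e,
                    const char *key, const char *val)
{
    e->key  = key;
    e->val  = val;
    e->next = NULL;
    if (l->tail != NULL)
        l->tail->next = e;
    else
        l->head = e;
    l->tail = e;
}

/* Next entry with this key, searching after prev (or from the head
 * when prev is NULL).  Calling it repeatedly walks a multivalued key. */
const apreq_entry *apreq_list_next(const apreq_list *l,
                                   const apreq_entry *prev,
                                   const char *key)
{
    const apreq_entry *e = (prev != NULL) ? prev->next : l->head;

    for (; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e;
    return NULL;
}

/* Hash-like scalar lookup: first value stored under key, or NULL. */
const char *apreq_list_get(const apreq_list *l, const char *key)
{
    const apreq_entry *e = apreq_list_next(l, NULL, key);

    return (e != NULL) ? e->val : NULL;
}
```

Lookups are linear in the number of entries, which is the trade-off against a real hash; for the handful of params or cookies on a typical request that may well be acceptable.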

-- 
Joe Schaefer

Re: proposed 2.0 features/layout (was Re: starting apreq 2.0)

Posted by Stas Bekman <st...@stason.org>.

Joe Schaefer wrote:

> Here's a list of proposed changes to the C API; please comment


Commenting only on issues that I can ;)


> 1) use "apreq_" as the generic prefix for files and functions
> in the C API.
> 
> e.g. structs like ApacheCookie and ApacheRequest replaced
> by apreq_cookie_t and apreq_request_t; function names 
> and macros map similarly
> 
>   ApacheRequest_new      -> apreq_request_new
>   ApacheCookieJarFetch   -> apreq_cookie_jar_fetch


+1


> The current header files are inconsistent and have
> too much cruft.  Although I prefer using a common
> naming convention for both macros and functions within
> header files, I could live with adopting a different
> scheme for macros.  But the current .h files are not
> consistent in this regard, and IMO that needs to change.


+ for common naming convention.


> 3) persistence across various apache phases?
> 
> Just a thought- we don't offer this with apreq 1.0, but we 
> could offer some internal support for an instance() in 2.0.


+1, but this should be aware of threaded environments. So you have to 
use some wrapper to store it as a global variable or in TLS (thread-local 
storage). I didn't look at how various mod_'s implement it, but mod_perl 
has modperl_global.[hc] for taking care of these issues transparently to 
the MPM type in use.
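
One way to sidestep the global-versus-TLS question is to hang the object off the request itself, so that each request carries its own instance regardless of which thread serves it. The sketch below only illustrates that idea: mock_request_rec stands in for Apache's request_rec, apreq_request_instance is a hypothetical name, and calloc stands in for whatever pool allocation a real module would use.

```c
#include <stddef.h>
#include <stdlib.h>

/* Placeholder for the parsed-request object. */
typedef struct {
    int parsed;
} apreq_request_t;

/* Stand-in for request_rec: a single per-request slot reserved for
 * apreq.  A real module would need an equivalent per-request hook. */
typedef struct {
    void *apreq_slot;
} mock_request_rec;

/* Return the request's apreq object, creating it on first use.
 * Later phases of the same request get the same pointer back; other
 * requests get their own, so no state is shared across threads that
 * serve different requests.  Returns NULL if allocation fails. */
apreq_request_t *apreq_request_instance(mock_request_rec *r)
{
    if (r->apreq_slot == NULL) {
        apreq_request_t *req = calloc(1, sizeof *req);

        if (req == NULL)
            return NULL;
        r->apreq_slot = req;
    }
    return r->apreq_slot;
}
```

In this sketch the caller is responsible for freeing the slot when the request ends; with pool allocation that cleanup would come for free.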


_____________________________________________________________________
Stas Bekman             JAm_pH      --   Just Another mod_perl Hacker
http://stason.org/      mod_perl Guide   http://perl.apache.org/guide
mailto:stas@stason.org  http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/