Posted to apreq-dev@httpd.apache.org by Joe Schaefer <jo...@sunstarsys.com> on 2003/01/19 17:46:30 UTC
apreq-2 uploads as bucket brigades?
I'd like to drop the req->upload linked list of temp files,
and use bucket brigades in their place. I'm planning to
use a common apreq_param_t struct which extends apreq_value_t
like so:
struct apreq_param_t {
    enum { ASCII, UTF_8, UTF_16, ISO_LATIN_1 } charset;
    char *language;
    apreq_table_t *info; /* mime headers */
    apr_bucket_brigade *bb; /* represents file contents */
    apreq_value_t v;
};
The main advantage of this approach would be that we can use
the full bucket api for managing the upload data. The brigade
could use heap-allocated buckets for smaller uploads, and then
switch over to mmapped or file buckets after a certain size.
The bucket API makes this look pretty easy.
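A minimal sketch of the "small uploads in memory, large uploads on disk" idea, written in plain C so it stands alone. It does not use the APR bucket API: upload_spool, spool_init, spool_write and spool_free are hypothetical names, and tmpfile() stands in for whatever file or mmap bucket a real brigade would switch to.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a brigade; not an APR type. */
typedef struct {
    char  *mem;    /* in-memory data while the upload is small */
    size_t len;    /* total bytes stored so far */
    size_t limit;  /* spill to a file once len would exceed this */
    FILE  *spill;  /* NULL until the threshold is crossed */
} upload_spool;

static void spool_init(upload_spool *s, size_t limit)
{
    s->mem = NULL;
    s->len = 0;
    s->limit = limit;
    s->spill = NULL;
}

/* Append a chunk; returns 0 on success, -1 on failure. */
static int spool_write(upload_spool *s, const char *data, size_t n)
{
    if (n == 0)
        return 0;
    if (s->spill == NULL && s->len + n > s->limit) {
        /* Crossing the threshold: move what we hold into a temp file. */
        s->spill = tmpfile();
        if (s->spill == NULL)
            return -1;
        if (s->len > 0 && fwrite(s->mem, 1, s->len, s->spill) != s->len)
            return -1;
        free(s->mem);
        s->mem = NULL;
    }
    if (s->spill != NULL) {
        if (fwrite(data, 1, n, s->spill) != n)
            return -1;
    } else {
        char *grown = realloc(s->mem, s->len + n);
        if (grown == NULL)
            return -1;
        memcpy(grown + s->len, data, n);
        s->mem = grown;
    }
    s->len += n;
    return 0;
}

/* Release the memory buffer and close the temp file, if any. */
static void spool_free(upload_spool *s)
{
    free(s->mem);
    if (s->spill != NULL)
        fclose(s->spill);
    s->mem = NULL;
    s->spill = NULL;
    s->len = 0;
}
```

The caller only ever sees spool_write(); where the bytes end up is an internal decision, which is the property the brigade approach is meant to give.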
The main disadvantage of this approach would be that we'd need
to support the full bucket api for managing the upload.
With actual files, seek(), dup(), link(), and buffered read()
are natural operations. I think we'd have to provide a
compatibility layer for some of these, perhaps using a
meta-bucket at the front of the brigade?
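To make the compatibility-layer idea concrete, here is a rough sketch of seek() and read() emulated over a list of in-memory chunks. Everything in it is hypothetical: chunk stands in for a heap bucket, upload_cursor plays the role the meta-bucket might play (remembering a file position), and none of the names come from APR or apreq.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical chunk list standing in for a brigade of heap buckets. */
typedef struct chunk {
    const char   *data;
    size_t        len;
    struct chunk *next;
} chunk;

/* Hypothetical cursor: the "meta" state that remembers a file position. */
typedef struct {
    const chunk *head;
    size_t       pos;  /* absolute offset from the start of the upload */
} upload_cursor;

/* Total number of bytes held in the chunk list. */
static size_t chunks_size(const chunk *c)
{
    size_t total = 0;
    for (; c != NULL; c = c->next)
        total += c->len;
    return total;
}

/* Absolute seek; returns 0 on success, -1 if offset is past the end. */
static int cursor_seek(upload_cursor *cur, size_t offset)
{
    if (offset > chunks_size(cur->head))
        return -1;
    cur->pos = offset;
    return 0;
}

/* Copy up to n bytes from the current position; returns bytes copied. */
static size_t cursor_read(upload_cursor *cur, char *buf, size_t n)
{
    size_t skip = cur->pos, copied = 0;
    const chunk *c;

    for (c = cur->head; c != NULL && copied < n; c = c->next) {
        size_t take;
        if (skip >= c->len) {   /* chunk lies wholly before pos */
            skip -= c->len;
            continue;
        }
        take = c->len - skip;
        if (take > n - copied)
            take = n - copied;
        memcpy(buf + copied, c->data + skip, take);
        copied += take;
        skip = 0;
    }
    cur->pos += copied;
    return copied;
}
```

This version walks the list from the head on every read, which is simple but linear in the number of chunks; a real layer would probably cache the current chunk.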
--
Joe Schaefer
Re: apreq-2 uploads as bucket brigades?
Posted by Stas Bekman <st...@stason.org>.
Joe Schaefer wrote:
> Stas Bekman <st...@stason.org> writes:
>
>
>>Joe Schaefer wrote:
>>
>>>I'd like to drop the req->upload linked list of temp files,
>>>and use bucket brigades in their place. I'm planning to
>>>use a common apreq_param_t struct which extends apreq_value_t
>>>like so:
>>>
>>>struct apreq_param_t {
>>> enum { ASCII, UTF_8, UTF_16, ISO_LATIN_1 } charset;
>>> char *language;
>>> apreq_table_t *info; /* mime headers */
>>>
>>> apr_bucket_brigade *bb; /* represents file contents */
>>>
>>> apreq_value_t v;
>>>};
>>>
>>>
>>>The main advantage of this approach would be that we can use
>>>the full bucket api for managing the upload data. The brigade
>>>could use heap-allocated buckets for smaller uploads, and then
>>>switch over to mmapped or file buckets after a certain size.
>>>The bucket API makes this look pretty easy.
>>
>>And you can even write upload hook filters with that ;)
>>
>>I agree that using a polished bb API will make apreq's code more
>>robust, and maybe shorter?
>
>
> Right. Hopefully it'll make it easier to use apreq-2 within an
> asynchronous IO environment as well.
>
> By the way, is there a TIEHANDLE API for bucket brigades in modperl
> 2? The core issue I'm concerned about is: what happens when an
> apreq-2 application wants to read the whole brigade, i.e.
>
> my $fh = $upload->fh;
> print while <$fh>;
>
> ? The file-buckets within the brigade will start generating
> heap-buckets. It *must* be our job to manage those additional
> heap-buckets, not the application programmer's job. Perl's
> refcounts should work well for this, but that might mean we'd
> end up pushing the "FILE API" into the Perl glue and out
> of the apreq-2 core.
Look at the streaming filters API:
http://perl.apache.org/docs/2.0/user/handlers/filters.html#Stream_oriented_Output_Filter
These can be adapted to work with upload objects as well.
A complete TIEHANDLE is planned for filters (implemented only partially),
and should be easy to re-use for file uploads as well.
__________________________________________________________________
Stas Bekman JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org http://ticketmaster.com
Re: apreq-2 uploads as bucket brigades?
Posted by Joe Schaefer <jo...@sunstarsys.com>.
Stas Bekman <st...@stason.org> writes:
> Joe Schaefer wrote:
> > I'd like to drop the req->upload linked list of temp files,
> > and use bucket brigades in their place. I'm planning to
> > use a common apreq_param_t struct which extends apreq_value_t
> > like so:
> >
> > struct apreq_param_t {
> > enum { ASCII, UTF_8, UTF_16, ISO_LATIN_1 } charset;
> > char *language;
> > apreq_table_t *info; /* mime headers */
> >
> > apr_bucket_brigade *bb; /* represents file contents */
> >
> > apreq_value_t v;
> > };
> >
> >
> > The main advantage of this approach would be that we can use
> > the full bucket api for managing the upload data. The brigade
> > could use heap-allocated buckets for smaller uploads, and then
> > switch over to mmapped or file buckets after a certain size.
> > The bucket API makes this look pretty easy.
>
> And you can even write upload hook filters with that ;)
>
> I agree that using a polished bb API will make apreq's code more
> robust, and maybe shorter?
Right. Hopefully it'll make it easier to use apreq-2 within an
asynchronous IO environment as well.
By the way, is there a TIEHANDLE API for bucket brigades in modperl
2? The core issue I'm concerned about is: what happens when an
apreq-2 application wants to read the whole brigade, i.e.
    my $fh = $upload->fh;
    print while <$fh>;
? The file-buckets within the brigade will start generating
heap-buckets. It *must* be our job to manage those additional
heap-buckets, not the application programmer's job. Perl's
refcounts should work well for this, but that might mean we'd
end up pushing the "FILE API" into the Perl glue and out
of the apreq-2 core.
--
Joe Schaefer
Re: apreq-2 uploads as bucket brigades?
Posted by Stas Bekman <st...@stason.org>.
Joe Schaefer wrote:
> I'd like to drop the req->upload linked list of temp files,
> and use bucket brigades in their place. I'm planning to
> use a common apreq_param_t struct which extends apreq_value_t
> like so:
>
> struct apreq_param_t {
> enum { ASCII, UTF_8, UTF_16, ISO_LATIN_1 } charset;
> char *language;
> apreq_table_t *info; /* mime headers */
>
> apr_bucket_brigade *bb; /* represents file contents */
>
> apreq_value_t v;
> };
>
>
> The main advantage of this approach would be that we can use
> the full bucket api for managing the upload data. The brigade
> could use heap-allocated buckets for smaller uploads, and then
> switch over to mmapped or file buckets after a certain size.
> The bucket API makes this look pretty easy.
And you can even write upload hook filters with that ;)
I agree that using a polished bb API will make apreq's code more robust,
and maybe shorter?
> The main disadvantage of this approach would be that we'd need
> to support the full bucket api for managing the upload.
> With actual files, seek(), dup(), link(), and buffered read()
> are natural operations. I think we'd have to provide a
> compatibility layer for some of these, perhaps using a
> meta-bucket at the front of the brigade?
Why would you really want to support all of them? Just croak for
non-implemented ones and if wanted the implementation can come later.
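In C terms, "croak for the non-implemented ones" could look something like the sketch below: an operation table where an empty slot yields a "not implemented" error rather than a crash. The names upload_handle, upload_ops, UPLOAD_ENOTIMPL and the try_ wrappers are all made up for illustration and are not part of apreq.

```c
#include <stddef.h>

enum { UPLOAD_OK = 0, UPLOAD_ENOTIMPL = -1 };

typedef struct upload_handle upload_handle;

/* Hypothetical operation table; a NULL slot means "not implemented yet". */
typedef struct {
    int (*seek)(upload_handle *u, long offset);
    int (*link)(upload_handle *u, const char *path);
} upload_ops;

struct upload_handle {
    const upload_ops *ops;
    long              pos;
};

/* Wrappers report UPLOAD_ENOTIMPL instead of calling through a NULL slot. */
static int upload_try_seek(upload_handle *u, long offset)
{
    if (u->ops->seek == NULL)
        return UPLOAD_ENOTIMPL;
    return u->ops->seek(u, offset);
}

static int upload_try_link(upload_handle *u, const char *path)
{
    if (u->ops->link == NULL)
        return UPLOAD_ENOTIMPL;
    return u->ops->link(u, path);
}

/* An in-memory backend that supports seek but has no link yet. */
static int mem_seek(upload_handle *u, long offset)
{
    u->pos = offset;
    return UPLOAD_OK;
}

static const upload_ops mem_upload_ops = { mem_seek, NULL };
```

A missing operation can then be filled in later by populating the slot, without changing any caller.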
Re: apreq-2 uploads as bucket brigades?
Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 06:24 PM 1/21/2003, Stas Bekman wrote:
>Joe Schaefer wrote:
>
>>>FILE*'s can be passed to Tcl to create new file handles, and this is
>>>something that I feel is useful, because it gives you a really simple
>>>API for dealing with smaller files, even if it's not terribly
>>>efficient.
>>
>>It is useful, natural, and convenient. I'd just like to avoid making
>>it mandatory like we did with apreq-1. XForms will present a new set
>>of needs for our mfd parser, and the extra flexibility of brigades over
>>a vanilla FILE* pointer could be a really big winner there.
>
>Though it's not crossplatform. That's why apr is using apr_file_t and
>provides the method to extract the native implementation
>(FILE/HANDLE/...). I believe we should be using apr_file_t and not FILE*.
I would agree (as an apr hack.)
Can I suggest you also consider the benefit of a custom bucket type?
It might be possible to support the seek/read/write model against the
set-aside file while still supporting the brigade read. This *could* be
a very happy compromise.
I can't go into it tonight (trying to get the Win9X users on 2.0.44
unbroken) but I'd be happy to share some more thoughts on this
later in the week.
Bill
Re: apreq-2 uploads as bucket brigades?
Posted by "David N. Welton" <da...@dedasys.com>.
Stas Bekman <st...@stason.org> writes:
> Joe Schaefer wrote:
> >>FILE*'s can be passed to Tcl to create new file handles, and this
> >>is something that I feel is useful, because it gives you a really
> >>simple API for dealing with smaller files, even if it's not
> >>terribly efficient.
> > It is useful, natural, and convenient. I'd just like to avoid
> > making it mandatory like we did with apreq-1. XForms will present
> > a new set of needs for our mfd parser, and the extra flexibility
> > of brigades over a vanilla FILE* pointer could be a really big
> > winner there.
> Though it's not crossplatform. That's why apr is using apr_file_t
> and provides the method to extract the native implementation
> (FILE/HANDLE/...). I believe we should be using apr_file_t and not
> FILE*.
+1 to that. Tcl uses the same thing (the native implementation),
and it would be better to have a HANDLE instead of a FILE on Windows.
--
David N. Welton
Consulting: http://www.dedasys.com/
Personal: http://www.dedasys.com/davidw/
Free Software: http://www.dedasys.com/freesoftware/
Apache Tcl: http://tcl.apache.org/
Re: apreq-2 uploads as bucket brigades?
Posted by Stas Bekman <st...@stason.org>.
Joe Schaefer wrote:
>>FILE*'s can be passed to Tcl to create new file handles, and this is
>>something that I feel is useful, because it gives you a really simple
>>API for dealing with smaller files, even if it's not terribly
>>efficient.
>
>
> It is useful, natural, and convenient. I'd just like to avoid making
> it mandatory like we did with apreq-1. XForms will present a new set
> of needs for our mfd parser, and the extra flexibility of brigades over
> a vanilla FILE* pointer could be a really big winner there.
Though it's not crossplatform. That's why apr is using apr_file_t and provides
the method to extract the native implementation (FILE/HANDLE/...). I believe
we should be using apr_file_t and not FILE*.
Re: apreq-2 uploads as bucket brigades?
Posted by Joe Schaefer <jo...@sunstarsys.com>.
davidw@dedasys.com (David N. Welton) writes:
> Joe Schaefer <jo...@sunstarsys.com> writes:
>
> > Performance is one reason. If we don't *need* apreq to use the OS's
> > filesystem for spooling, we shouldn't. We can leave that up to the
> > application, but we could make it easy for them to pretend uploads
> > are always files by providing things like
>
> > apreq_upload_link(apreq_upload_t *upload)
> > apreq_upload_seek(apreq_upload_t *upload)
>
> > instead of just handing them a FILE* pointer to play with.
>
> Forgive me if I haven't been following as closely as I should, but
> would it still be possible to get ahold of a FILE* somehow?
I think so. But I'd like the FILE* API to be lazy.
In other words, apreq shouldn't generate a FILE* internally unless
1) the upload is too large to store in memory, or
2) an apreq user (not a "hook author") wants to deal
with the upload data via FILE*, in which case apreq
can generate the FILE* handle if it hasn't already done so.
What I'm looking for is an internal brigade design that manages
such behavior, which is why I suggested using a meta bucket.
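A rough sketch of the lazy behavior described above, in plain C: the upload stays in memory, and a FILE* is created only the first time somebody asks for one, then cached. lazy_upload and lazy_upload_file are hypothetical names, and tmpfile() is just a convenient stand-in for however apreq would actually spool the data.

```c
#include <stdio.h>

/* Hypothetical upload record: data stays in memory until a FILE* is requested. */
typedef struct {
    const char *data;
    size_t      len;
    FILE       *fp;   /* created on first request, then cached */
} lazy_upload;

/* Return a FILE* positioned at the start of the data, creating it on demand.
 * Returns NULL if the temp file cannot be created or written. */
static FILE *lazy_upload_file(lazy_upload *u)
{
    if (u->fp == NULL) {
        FILE *fp = tmpfile();
        if (fp == NULL)
            return NULL;
        if (fwrite(u->data, 1, u->len, fp) != u->len) {
            fclose(fp);
            return NULL;
        }
        u->fp = fp;
    }
    rewind(u->fp);
    return u->fp;
}
```

Callers that never ask for a file never touch the filesystem, while callers that do get the same handle back on every request.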
It would be really cool (hint) if someone took a crack at this while
I'm working on the rest of apreq-2's core. On a related note, the
header docs for apreq_tables.h and apreq_cookie.h are stable in the
current httpd-apreq-2 cvs, but certainly not complete. People looking
for things to do might want to have a look at fixing those, or
contributing a documentation system (any favorites?), or writing unit
tests for tables and cookies. I expect to have a working core library
within the next week, as well as one stand-alone environment available
for testing.
> FILE*'s can be passed to Tcl to create new file handles, and this is
> something that I feel is useful, because it gives you a really simple
> API for dealing with smaller files, even if it's not terribly
> efficient.
It is useful, natural, and convenient. I'd just like to avoid making
it mandatory like we did with apreq-1. XForms will present a new set
of needs for our mfd parser, and the extra flexibility of brigades over
a vanilla FILE* pointer could be a really big winner there.
--
Joe Schaefer
Re: apreq-2 uploads as bucket brigades?
Posted by "David N. Welton" <da...@dedasys.com>.
Joe Schaefer <jo...@sunstarsys.com> writes:
> Performance is one reason. If we don't *need* apreq to use the OS's
> filesystem for spooling, we shouldn't. We can leave that up to the
> application, but we could make it easy for them to pretend uploads
> are always files by providing things like
> apreq_upload_link(apreq_upload_t *upload)
> apreq_upload_seek(apreq_upload_t *upload)
> instead of just handing them a FILE* pointer to play with.
Forgive me if I haven't been following as closely as I should, but
would it still be possible to get ahold of a FILE* somehow? FILE*'s
can be passed to Tcl to create new file handles, and this is something
that I feel is useful, because it gives you a really simple API for
dealing with smaller files, even if it's not terribly efficient.
Re: apreq-2 uploads as bucket brigades?
Posted by Joe Schaefer <jo...@sunstarsys.com>.
"Issac Goldstand" <ma...@beamartyr.net> writes:
> OK, but...
> Why not do both? Eg, provide access via bucket brigades
> and also keep the spooled files for more natural seek()ing
> (,etc)?
Performance is one reason. If we don't *need* apreq to use
the OS's filesystem for spooling, we shouldn't. We can leave
that up to the application, but we could make it easy for them
to pretend uploads are always files by providing things like
    apreq_upload_link(apreq_upload_t *upload)
    apreq_upload_seek(apreq_upload_t *upload)
    ...
instead of just handing them a FILE* pointer to play with.
IOW, generate the upload structs from the underlying
brigade, and let that struct be responsible for
emulating whatever additional file-type APIs we need.
Of course the upload_hook API will also need to change,
since the hooks would be responsible for generating the
bucket brigade. upload_hooks should be using the
bucket brigade api, *not* our apreq_upload api. The
hooks are really parser extensions, so they should be
stream-oriented anyways.
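As a sketch of what "stream-oriented" could mean for a hook, the example below has the parser side hand data to the hook one piece at a time, and lets the hook abort the upload by returning non-zero. It uses plain buffers instead of brigades to stay self-contained; upload_hook_fn, feed_upload and size_limit_hook are hypothetical names, not the apreq-2 hook API.

```c
#include <stddef.h>

/* Hypothetical hook signature: called once per piece as the parser produces it. */
typedef int (*upload_hook_fn)(void *ctx, const char *data, size_t len);

/* Hypothetical parser-side driver: feeds the input to the hook in pieces
 * of at most `piece` bytes. Returns 0, or the hook's non-zero result. */
static int feed_upload(const char *data, size_t len, size_t piece,
                       upload_hook_fn hook, void *ctx)
{
    size_t off = 0;

    if (piece == 0)
        return -1;
    while (off < len) {
        size_t n = (len - off < piece) ? len - off : piece;
        int rc = hook(ctx, data + off, n);
        if (rc != 0)
            return rc;   /* a hook can abort the upload */
        off += n;
    }
    return 0;
}

/* Example hook: counts bytes and rejects uploads over a limit. */
typedef struct {
    size_t seen;
    size_t max;
} size_limit;

static int size_limit_hook(void *ctx, const char *data, size_t len)
{
    size_limit *sl = ctx;
    (void)data;          /* this hook only looks at sizes */
    sl->seen += len;
    return sl->seen > sl->max ? -1 : 0;
}
```

The hook never needs seek() or a file handle, which is why it can sit below the file-style upload API rather than depend on it.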
--
Joe Schaefer
Re: apreq-2 uploads as bucket brigades?
Posted by Issac Goldstand <ma...@beamartyr.net>.
OK, but...
Why not do both? Eg, provide access via bucket brigades and also keep the
spooled files for more natural seek()ing (,etc)?
Issac
----- Original Message -----
From: "Joe Schaefer"
Subject: apreq-2 uploads as bucket brigades?
>
> I'd like to drop the req->upload linked list of temp files,
> and use bucket brigades in their place. I'm planning to
> use a common apreq_param_t struct which extends apreq_value_t
> like so:
>
> struct apreq_param_t {
> enum { ASCII, UTF_8, UTF_16, ISO_LATIN_1 } charset;
> char *language;
> apreq_table_t *info; /* mime headers */
>
> apr_bucket_brigade *bb; /* represents file contents */
>
> apreq_value_t v;
> };
>
>
> The main advantage of this approach would be that we can use
> the full bucket api for managing the upload data. The brigade
> could use heap-allocated buckets for smaller uploads, and then
> switch over to mmapped or file buckets after a certain size.
> The bucket API makes this look pretty easy.
>
> The main disadvantage of this approach would be that we'd need
> to support the full bucket api for managing the upload.
> With actual files, seek(), dup(), link(), and buffered read()
> are natural operations. I think we'd have to provide a
> compatibility layer for some of these, perhaps using a
> meta-bucket at the front of the brigade?
>
> --
> Joe Schaefer
>