You are viewing a plain text version of this content. The canonical link for it is here.

Posted to apreq-dev@httpd.apache.org by Jeffrey Horner <je...@vanderbilt.edu> on 2005/07/07 19:42:47 UTC

Generated spool file okay to copy after parse?

Hello,

I'm linking against libapreq2-2.05-dev and I'd like to know if the 
generated spool files are okay to copy after parsing the request POST 
data. Consider the following code (executed from within a request handler):

formd = apr_table_make(r->pool, APREQ_DEFAULT_NELTS);
filed = apr_table_make(r->pool, APREQ_DEFAULT_NELTS);

if (strcmp(r->method,"POST") == 0){

     psr = apreq_parser_make(
             r->pool,
             r->connection->bucket_alloc,
             apr_table_get(r->headers_in,"Content-Type"), 

             apreq_parser(apr_table_get(r->headers_in,"Content-Type")),
             0, /* FORCE TO SPOOL FILES */
             "/tmp",
             NULL,
             NULL);
     bb = apr_brigade_create(r->pool,r->connection->bucket_alloc);
     while((s = 
ap_get_brigade(r->input_filters,bb,AP_MODE_READBYTES,APR_BLOCK_READ,HUGE_STRING_LEN)) 
== APR_SUCCESS){
             apreq_parser_run(psr,formd,bb);
             if (APR_BUCKET_IS_EOS(APR_BRIGADE_LAST(bb))) break;
             apr_brigade_cleanup(bb);
         }

     filed = apreq_uploads(formd,r->pool);
}
    /* RUN SCRIPT ENGINE ALLOWING SCRIPTS TO COPY SPOOL FILES TO DESIRED
     * LOCATION IF filed non-empty
     */

Since I set the brigade limit to 0, libapreq will generate a brigade 
with 1 spool bucket for each file param uploaded. Poking through the 
libapreq code and the apr code, I see that on UNIX that writev() is 
called to ultimately write to the spool file. Can I reasonable assume 
that it's okay to expose that spool file name to my script engine to 
allow code to copy that file to another location? If not, should I flush 
the spool file descriptor first, then copy? Any other suggestions?

TIA

BTW - the whole apreq_param_t and apreq_value_t code is pretty sweet.
-- 
Jeffrey Horner       Computer Systems Analyst         School of Medicine
615-322-8606         Department of Biostatistics   Vanderbilt University

Re: Generated spool file okay to copy after parse?

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Jeffrey Horner <je...@vanderbilt.edu> writes:

> I don't claim expertise at understanding the Perl glue, but you might
> have a bug on your hands for the Apache2::Params::upload_link()
> function. If the brigade limit was set during parsing such that a file
> upload brigade contained heap buckets and a spool bucket at the end,
> then upload_link() looks like it may just copy the spool bucket,
> ignoring the heap buckets.

As I said, we cheat.  When the spool bucket is first constructed,
the previous heap buckets are written directly to the spool file.
The remainder of the upload is then appended directly to the file.
This cheat is what makes upload_link() actually work right.

> Regardless, since libapreq already exposes the spool bucket file via
> the api call apreq_brigade_spoolfile(), plus the fact the the Perl
> glue uses it, I think it's a safe bet that the api will support some
> sort of access to the tmp file created in the future.

Exactly.  We'll need to do something intelligent to deal
with the situation; apreq_brigade_spoolfile is really a 
first pass at it.  I'm just hesitant to endorse it as the
final solution for libapreq2.

-- 
Joe Schaefer

Re: Generated spool file okay to copy after parse?

Posted by Jeffrey Horner <je...@vanderbilt.edu>.

Joe Schaefer wrote:
[...]
> Certainly doable, but I think it'd be overkill.  We deal with these 
> exact same issues (by cheating and peeking at the spool bucket 
> implementation) in the perl glue, so having a common C API makes the
> most sense to me.
> 

I don't claim expertise at understanding the Perl glue, but you might
have a bug on your hands for the Apache2::Params::upload_link()
function. If the brigade limit was set during parsing such that a file
upload brigade contained heap buckets and a spool bucket at the end,
then upload_link() looks like it may just copy the spool bucket,
ignoring the heap buckets.

Regardless, since libapreq already exposes the spool bucket file via the
api call apreq_brigade_spoolfile(), plus the fact the the Perl glue uses
it, I think it's a safe bet that the api will support some sort of
access to the tmp file created in the future.

-- 
Jeffrey Horner       Computer Systems Analyst         School of Medicine
615-322-8606         Department of Biostatistics   Vanderbilt University

Re: Generated spool file okay to copy after parse?

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Jeffrey Horner <je...@vanderbilt.edu> writes:

> I'm not sure I like the idea of spooling the files to a tmp file just
> to copy them to another tmp file, which will ultimately be copied to
> another file by the script writer. Imagine this cost with a 500Mb file.

I understand it's a tricky balance.  IMO the right thing for us to do 
in apreq-dev@ is add some kind of apreq_brigade_link_file() which 
understands hard links (when available).  That's pretty much what we 
already do in the perl glue, and we could just promote that code to 
the C API.

> My next idea is to create a hook to siphon off all the brigades for
> file params and write to the tmp files myself. This way, I'm using
> the libapreq api and not presuming what the internals do...
>
> How's that sound?

Certainly doable, but I think it'd be overkill.  We deal with these 
exact same issues (by cheating and peeking at the spool bucket 
implementation) in the perl glue, so having a common C API makes the
most sense to me.

-- 
Joe Schaefer

Re: Generated spool file okay to copy after parse?

Posted by Jeffrey Horner <je...@vanderbilt.edu>.

Joe Schaefer wrote:
[...]
> Maybe :-).  The internal behavior of the spool brigades is something
> we need to work on from the httpd-side, since they have slightly
> different needs.  I want to see our spool buckets become a bit
> smarter: right now mod_apreq2 is creating multiple spool files
> when it could (in principle) consolidate them into a single file.
> The safest thing for you to do right now is to write copy the brigade
> to a separate file.  You shouldn't peek inside the brigade to
> look at the bucket types yet; ideally we'll provide another 
> utility function to intelligently write the spool brigade to 
> a file (here's my +1 for a patch to apreq_util.[ch] that adds an 
> apreq_brigade_copy_to_file function).

I'm not sure I like the idea of spooling the files to a tmp file just to 
copy them to another tmp file, which will ultimately be copied to 
another file by the script writer. Imagine this cost with a 500Mb file.

My next idea is to create a hook to siphon off all the brigades for file 
params and write to the tmp files myself. This way, I'm using the 
libapreq api and not presuming what the internals do...

How's that sound?

-- 
Jeffrey Horner       Computer Systems Analyst         School of Medicine
615-322-8606         Department of Biostatistics   Vanderbilt University

Re: Generated spool file okay to copy after parse?

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Jeffrey Horner <je...@vanderbilt.edu> writes:

> Since I set the brigade limit to 0, libapreq will generate a brigade with 1
> spool bucket for each file param uploaded. Poking through the libapreq
> code and the apr code, I see that on UNIX that writev() is called to
> ultimately write to the spool file. Can I reasonable assume that it's
> okay to expose that spool file name to my script engine to allow code
> to copy that file to another location? 

Maybe :-).  The internal behavior of the spool brigades is something
we need to work on from the httpd-side, since they have slightly
different needs.  I want to see our spool buckets become a bit
smarter: right now mod_apreq2 is creating multiple spool files
when it could (in principle) consolidate them into a single file.
The safest thing for you to do right now is to write copy the brigade
to a separate file.  You shouldn't peek inside the brigade to
look at the bucket types yet; ideally we'll provide another 
utility function to intelligently write the spool brigade to 
a file (here's my +1 for a patch to apreq_util.[ch] that adds an 
apreq_brigade_copy_to_file function).

> BTW - the whole apreq_param_t and apreq_value_t code is pretty sweet.

Glad you like it ;-)

-- 
Joe Schaefer