Posted to dev@perl.apache.org by Steve Hay <st...@uk.radan.com> on 2003/08/20 10:58:08 UTC

[PATCH] Clean up temp files [Was: Re: [PATCH mp1.28] Clean up file uploaded by test suite]

Hi Stas / Lincoln,

Lincoln: We've been having a discussion on the mod_perl dev list 
(http://marc.theaimsgroup.com/?l=apache-modperl-dev&m=106122218228508&w=2) 
about a problem with temporary files from file uploads being left behind 
by CGI.pm on Windows.  The problem occurs because the unlink is 
attempted before the file is closed, which doesn't work on Windows.  
Initially I sent a patch for the affected test script in mod_perl to 
simply close the file itself, which remedies the problem there, but Stas 
insisted that it was really a bug in CGI.pm.  I think he's right, and 
hopefully the attached patch (against 3.00) fixes it.
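
(To illustrate the underlying Windows behaviour outside of CGI.pm, 
here's a minimal sketch - the path is just an example:

    open my $fh, '>', 'C:\temp\example.tmp' or die "open: $!";
    print {$fh} "some data\n";
    unlink 'C:\temp\example.tmp'
        or warn "unlink while open: $!";   # "Permission denied" on Windows
    close $fh;
    unlink 'C:\temp\example.tmp'
        or warn "unlink after close: $!";  # succeeds once the handle is closed

On Windows the first unlink fails with "Permission denied" and the file 
is only removed by the second one, after the handle has been closed.)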

Stas Bekman wrote:

> Steve Hay wrote:
>
>> If I run this test script:
>>
>> ==========
>> use CGI;
>> my $cgi = CGI->new();
>> print    $cgi->header(),
>>    $cgi->start_html(),
>>    $cgi->start_multipart_form();
>> my $fh = $cgi->upload('upload');
>> if (defined $fh) {
>>    1 while <$fh>;
>>    print $cgi->p(sprintf("Read %d bytes", tell $fh));
>> }
>> else {
>>    print    $cgi->filefield({-name => 'upload'}),
>>        $cgi->submit();
>> }
>> print    $cgi->end_form(),
>>    $cgi->end_html();
>> ==========
>>
>> then the filehandle $fh gets left open, so when the DESTROY in 
>> CGI.pm's CGITempFile package runs and tries to unlink the temporary 
>> file, it fails because on Windows you can't unlink a file while it 
>> is still open.
>>
>> In fact, adding some debug into that DESTROY subroutine:
>>
>>    unlink $safe               # get rid of the file
>>        or print STDERR "Can't unlink '$safe': $!\n";
>>
>> yields this error message in my Apache error.log:
>>
>>    Can't unlink 'C:\temp\CGItemp34790': Permission denied
>>
>> If I add
>>
>>    close $fh;
>>
>> to the end of the "if" block in my test script above then the error 
>> goes away and the file is deleted.
>
>
> So $fh is getting destroyed after $cgi, which is what leads to the 
> problem.  If $fh could be made to be destroyed before $cgi, the 
> problem would go away.
>
> I think the proper solution is for CGI.pm to register a cleanup 
> handler which will do the cleanup after $cgi has gone.  That will 
> work for mod_perl; for plain CGI scripts perhaps an END block will 
> do.  Finally, making the returned filehandle an object with its own 
> DESTROY method may do the trick as well.
>
> Another solution I can think of is to use Devel::Peek to reduce the 
> reference count of all filehandles opened by CGI.pm to 0, from 
> CGI::DESTROY, but that's probably too intrusive and it's also not the 
> best idea to rely on a dev module ;) 
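
A cleanup-handler registration along the lines Stas suggests might look 
roughly like this under mod_perl 1.x (just a sketch - _cleanup_tmpfiles() 
is a made-up helper, and where exactly CGI.pm would hook it in is a 
separate question):

    if ($MOD_PERL) {    # assuming something like CGI.pm's mod_perl flag
        # mod_perl 1.x: run the cleanup at the end of the current request,
        # i.e. after the script's $cgi and $fh have gone away.
        Apache->request->register_cleanup(\&_cleanup_tmpfiles);
    }

    # For a plain CGI script, an END block fires at interpreter shutdown,
    # again after the user's filehandles have normally been released:
    END { _cleanup_tmpfiles(); }

That said, a more direct fix seems possible.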

I think the problem is this: the CGI object stores an Fh object and a 
CGITempFile object.  They're both stashed away at the end of 
read_multipart() - the Fh object ($filehandle) in $self->{$param}, and 
the CGITempFile object ($tmpfile) in $self->{'.tmpfiles'}.  When the CGI 
object is destroyed by Perl, there is no guarantee which of its 
contained objects gets destroyed first.

It would appear (on Windows, at least) that CGITempFile->DESTROY gets 
run before Fh->DESTROY, so the unlink is attempted while the filehandle 
is still open - which, on Windows, doesn't work.

The attached patch (against 3.00) fixes the problem for me by filling 
in the CGI class's currently empty DESTROY method with something 
useful: namely, it ensures that each Fh is destroyed before the 
corresponding CGITempFile.  To make this a little easier, I've stashed 
the Fh object in $self->{'.tmpfiles'} at the end of read_multipart() as 
well.
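
The patch itself is attached, but the heart of it is along these lines 
(just a sketch - the '.tmpfiles' hash layout and key names shown here 
are illustrative, not necessarily exactly what the patch uses):

    sub DESTROY {
        my $self = shift;
        foreach my $entry (values %{$self->{'.tmpfiles'} || {}}) {
            # Destroy the Fh first, which closes the filehandle...
            $entry->{hndl}->DESTROY if defined $entry->{hndl};
            # ...then the CGITempFile, whose DESTROY unlinks the file.
            $entry->{name}->DESTROY if defined $entry->{name};
        }
    }

With the handle closed before the unlink is attempted, the CGITempFile 
cleanup succeeds on Windows too.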

I think this is easier than mucking about with cleanup handlers and/or 
END blocks.  One doesn't normally explicitly call DESTROY methods, but 
it is OK according to Perl's perlobj manpage (in the section on 
"Destructors").

(That same manpage also comments "[contained] objects will be freed and 
destroyed automatically when the current object is freed, provided no 
other references to them exist elsewhere," but in this case there are 
two problems with that - firstly, other references to them _do_ exist 
(in the caller's script), and secondly, we need to control the _order_ 
in which the contained objects are destroyed - hence I think we're 
justified in doing this.)

What do you think?

- Steve