You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2007/05/02 22:39:22 UTC

[Bug 5444] New: Non-text temp files not removed

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444

           Summary: Non-text temp files not removed
           Product: Spamassassin
           Version: unspecified
          Platform: Other
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P5
         Component: spamassassin
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: jason@electronet.net


Using version 3.2.0 .spamassassinXXXXXXXXXXtmp files containing non-text are
being left behind.  This does not occur for all messages, only a small subset of
those actually being scanned.  Mail is still being processed as expected.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From jm@jmason.org  2007-05-05 10:47 -------

> Maybe a kind person who understands more of the code could have a look at the
> spampd code and tell me whether adding $mail->finish(); is the solution without
> breaking anything. Thanks in advance.

yep, spampd (and all users of the Mail::SpamAssassin classes) should always call 
$mail->finish() on every mail object, otherwise it'll leak memory, leave
temporary files around, etc.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From dfs@roaringpenguin.com  2007-05-08 11:14 -------
OK... circular references are a problem.  Should that be opened as a separate
bug?  I'm sure there's a way to architect it so you don't need circular
references.  (I realize that would be a *huge* change to SpamAssassin, but I
think it's a worthy long-term goal.)

Regards,

David.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444


jason@electronet.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unspecified                 |3.2.0






------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From felicity@apache.org  2007-05-08 13:42 -------
FWIW, I can't think of any circular references in the Message/Message::Node
code.  The objects are trees without parent references, so it's all
unidirectional.  However, the last time we tried to let Perl's garbage
collection "do the right thing", it barfed horribly (ie: didn't work), so we had
to go back to the finish() method.

FWIW, calling Message::finish() has been a documented/required part of the API
for a long time (since the release of 3.0.0), so regardless of SA's code getting
improved, it's a bug in MD imo.

Also FWIW, this is one of the reasons I like having a decently long RC cycle
before a new major release so that people can test with third party tools to
shake loose issues in both tools.  Apparently no one with MD tested 3.2 until it
was released. :(


(In reply to comment #10)
> OK... circular references are a problem.  Should that be opened as a separate
> bug?  I'm sure there's a way to architect it so you don't need circular
> references.  (I realize that would be a *huge* change to SpamAssassin, but I
> think it's a worthy long-term goal.)
> 
> Regards,
> 
> David.
> 





------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From spamassassin@invoca.ch  2007-05-09 08:53 -------
The maintainer of spampd will address the issue in a new upcoming release. As a
quick solution I implemented the patch below in my spampd-2.30 rpms and things
look good after 24h in production.

--- spampd-2.30/spampd.orig     2005-10-31 20:45:53.000000000 +0100
+++ spampd-2.30/spampd  2007-05-08 10:34:57.000000000 +0200
@@ -582,6 +582,7 @@
                                $self->log(2, "%s", "rules hit for $msgid: " .
$status->get_names_of_tests_hit); }
 
                    $status->finish();
+                   $mail->finish();
      
                    # set the timeout alarm back to wherever it was at
                    alarm($previous_alarm);



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From jm@jmason.org  2007-05-08 10:54 -------
David -- the problem is that we have to maintain circular dependencies
internally, so using a DESTROY hook won't work, unless we switch to use weak
references, which aren't supported in perl 5.6.x, one of our supported
platforms.  The end result is that we need an explicit API to break those weak
refs so that GC'ing will work.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From dfs@roaringpenguin.com  2007-05-08 11:13 -------
(In reply to comment #7)
 > ... destructor ...

Never mind... I forgot about the circular references strewn all throughout
SpamAssassin. :-(





------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From kmcgrail@peregrinehw.com  2007-05-09 06:51 -------
As reported by others, this bug also causes tmp files to be orphaned by
MIMEDefang using SA 3.2.0.

The following patch from David F. Skoll against the latest MIMEDefang should fix
it for MIMEDefang.  (This patch is against SVN trunk, but it applies cleanly
to 2.62.)

Testing shows that this resolves the issue for MIMEDefang / SpamAssassin 3.2.0
users.

Index: mimedefang.pl.in
===================================================================
--- mimedefang.pl.in (revision 14638)
+++ mimedefang.pl.in (working copy)
@@ -6317,6 +6317,7 @@
     my $status;
     push_status_tag("Running SpamAssassin");
     $status = $object->check($mail);
+    $mail->finish();
     pop_status_tag();
     return $status;
 }




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444


spamassassin@dostech.ca changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|3.2.1                       |3.2.2






------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From jm@jmason.org  2007-05-08 11:26 -------
sure, open away.  it may be possible to redo without them...



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444


spamassassin@invoca.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |spamassassin@invoca.ch




------- Additional Comments From spamassassin@invoca.ch  2007-05-05 02:45 -------
I see the same behaviour. The tmpfiles are not really left behind, they just
stay there for quite a long time, in my case many hours.
>From what I found the the issue comes from the following lines in Message.pm:

  # If the part type is not one that we're likely to want to use, go
  # ahead and write the part data out to a temp file -- why keep sucking
  # up RAM with something we're not going to use?
  #
  if ($msg->{'type'} !~ m@^(?:text/(?:plain|html)$|message\b)@) {
    my $filepath;
    ($filepath, $msg->{'raw'}) = Mail::SpamAssassin::Util::secure_tmpfile();

    if ($filepath) {
      # The temp file was created, add it to the list of pending deletions
      # we cannot just delete immediately in the POSIX idiom, as this is
      # unportable (to win32 at least)
      push @{$self->{tmpfiles}}, $filepath;
      $msg->{'raw'}->print(@{$body});
    }
  }

  # if the part didn't get a temp file, go ahead and store the data in memory
  if (!exists $msg->{'raw'}) {
    $msg->{'raw'} = $body;
  }

Simply commenting out the first section of the code makes those tempfiles not
appear, so I know the problem is here. From what I understand the tmpfile here
is created but only removed when the process terminates, to overcome the
inability of win32 to unlink an open file. Maybe there is still a more elegant
solution than keeping those file on disk for a long time.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From spamassassin@invoca.ch  2007-05-05 09:46 -------
At least in my case using spampd from
http://www.worlddesign.com/index.cfm/rd/mta/spampd.htm the problem seems to be
that it does a $status->finish(); but no $mail->finish(); for whatever reason.
My guess, whithout really understanding the whole thing, is that
$mail->finish(); would cleanup those tmpfiles. Spampd uses
Net::Server::PreForkSimple for it's worker childs and per default run with
max_requests = 20 before the child dies. So I guess only after the 20 requests,
the child dies and therefore the tmpfiles are cleaned up. And because many
childs also run in parallel one can imagine how many tmpfiles are always lying
around before they get removed.
Maybe a kind person who understands more of the code could have a look at the
spampd code and tell me whether adding $mail->finish(); is the solution without
breaking anything. Thanks in advance.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From jm@jmason.org  2007-05-08 15:41 -------
(In reply to comment #12)
> FWIW, I can't think of any circular references in the Message/Message::Node
> code.  The objects are trees without parent references, so it's all
> unidirectional.  However, the last time we tried to let Perl's garbage
> collection "do the right thing", it barfed horribly (ie: didn't work), so we had
> to go back to the finish() method.

There's probably a few cases we can clean up a little though...

> Also FWIW, this is one of the reasons I like having a decently long RC cycle
> before a new major release so that people can test with third party tools to
> shake loose issues in both tools.  Apparently no one with MD tested 3.2 until it
> was released. :(

I was pretty sure this was going to happen anyway ;)  We had a pretty long set
of rc's, but it was clear most people were avoiding them until we made a GA release.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From nstsupport@byteplant.com  2007-07-10 09:55 -------
(In reply to comment #16)
> hmm, I'm looking at this, and I don't see any SA bug left; it's all in the
> calling apps needing to call ->finish().  reopen if there's an SA bug to fix
 ;)

I am sorely tempted to reopen, because it's time to mention Windows.

The calling app here is SpamAssassin.exe, built with perl 5.8.7, and one should 
believe that it does call ->finish().

While most of the time SpamAssassin does seem to clean up OK after itself, 
there are cases where it doesn't. Some 3.2.1 users were reporting thousands of 
leftover tmp files. And no, SpamAssassin.exe was not forcibly killed with 
TerminateProcess() or similar.

The exe built from SpamAssassin 3.1.7 still works fine, so something happened 
along the way.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WORKSFORME




------- Additional Comments From jm@jmason.org  2007-06-15 05:35 -------
hmm, I'm looking at this, and I don't see any SA bug left; it's all in the
calling apps needing to call ->finish().  reopen if there's an SA bug to fix ;)



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From spamassassin@invoca.ch  2007-05-05 14:37 -------
Justin, thanks for the clarification. I'll suggest the change to the spampd
maintainer.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444


jason@electronet.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P5                          |P1






------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From jm@jmason.org  2007-07-10 10:04 -------
if there's a separate issue affecting only Windows, please open a separate bug
for that.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From dfs@roaringpenguin.com  2007-05-08 10:44 -------
IMO, having to call finish() is a bit of a hack.  When the object goes out
of scope, its destructor should do the equivalent of finish().




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444





------- Additional Comments From lwilton@earthlink.net  2007-05-05 03:00 -------
You would think they could be removed after SA has completed processing the 
message that they were created from.  By that point anything profitably looking 
at them should have terminated, as the score has already been generated.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5444] Non-text temp files not removed

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5444


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|Undefined                   |3.2.1




------- Additional Comments From jm@jmason.org  2007-05-05 03:51 -------
(In reply to comment #2)
> You would think they could be removed after SA has completed processing the 
> message that they were created from.  By that point anything profitably looking 
> at them should have terminated, as the score has already been generated.

that's what should be happening -- iirc, the Mail::SA::Message->finish() method
deletes them?



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.