Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2005/07/06 02:20:16 UTC

[Bug 4461] New: mass-check --reuse cannot deal with previously-unscanned mail

http://bugzilla.spamassassin.org/show_bug.cgi?id=4461

           Summary: mass-check --reuse cannot deal with previously-unscanned
                    mail
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: major
          Priority: P2
         Component: Masses
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: jm@jmason.org


Noted from discussion on dev.  It seems that "mass-check --reuse" essentially
works by zeroing the scores of the reused rules entirely.  This means that if a
corpus contains messages that were not scanned by SpamAssassin with network
tests active, it will simply record those mails as misses for the reused
rules.

There is a range of connected issues:

  - should it re-enable those rules somehow, dynamically? (I think this
    may be best.)

  - should it issue a warning?
  
  - or maintain a file containing the mass-check IDs of mails that need a full
    scan, instead?
  
  - what if the user has not installed Mail::SPF::Query, or a similar required
    module, or was running with -L, and therefore would have X-Spam-Status
    lines but never any hits on those rules?

Also worth noting that many SPF_FAIL hits on ham noted in prior mass-checks
were due to changes in SPF records over time.  Reusing SPF may be the only
viable way to mass-check SPF using old corpora.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4461] mass-check --reuse cannot deal with previously-unscanned mail

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4461


duncf@debian.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |duncf@debian.org




------- Additional Comments From duncf@debian.org  2005-07-05 21:03 -------
*** Bug 4455 has been marked as a duplicate of this bug. ***




[Bug 4461] [review] mass-check --reuse cannot deal with previously-unscanned mail

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4461





------- Additional Comments From parkerm@pobox.com  2005-07-11 22:33 -------
Subject: Re:  [review] mass-check --reuse cannot deal with previously-unscanned
 mail

That's OK.  I wish you had fixed up the other piece we had discussed,
checking the status of the open call for the mass_prefs file, but since
it was deemed trivial I went ahead and did it:

Sending        masses/mass-check
Transmitting file data .
Committed revision 215926.






[Bug 4461] mass-check --reuse cannot deal with previously-unscanned mail

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4461





------- Additional Comments From jm@jmason.org  2005-07-05 21:15 -------
Oops, didn't spot the dup.  I think --reuse is at least smart enough to check
the version in the X-Spam-Status line before reusing.
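
A version check of this kind could look roughly like the sketch below.  It is
a Python illustration only, not the Perl in masses/mass-check.  The helper
names (status_version, can_reuse) and the minimum version (3, 0, 0) are
assumptions made for the example, and it assumes the status line carries a
"version=" field.

```python
import re

def status_version(status_header):
    """Return the version= field of an X-Spam-Status value as a tuple of ints,
    or None when the field is absent."""
    match = re.search(r"\bversion=(\d+(?:\.\d+)*)", status_header)
    if not match:
        return None
    return tuple(int(part) for part in match.group(1).split("."))

def can_reuse(status_header, minimum=(3, 0, 0)):
    """Hypothetical policy: reuse hits only from status lines written by a
    version at or above `minimum` (the threshold is an assumption)."""
    version = status_version(status_header)
    return version is not None and version >= minimum
```

A status line with no version field is treated here as not reusable, which is
the conservative choice.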




[Bug 4461] [review] mass-check --reuse cannot deal with previously-unscanned mail

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4461





------- Additional Comments From henry@stern.ca  2005-07-11 15:46 -------
Looks good to me.  +1




[Bug 4461] [review] mass-check --reuse cannot deal with previously-unscanned mail

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4461





------- Additional Comments From jm@jmason.org  2005-07-07 21:36 -------
+1 looks perfect




[Bug 4461] mass-check --reuse cannot deal with previously-unscanned mail

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4461





------- Additional Comments From parkerm@pobox.com  2005-07-07 17:09 -------
Created an attachment (id=3008)
 --> (http://bugzilla.spamassassin.org/attachment.cgi?id=3008&action=view)
Patch File

This is the general idea, although it is only minimally tested.  I've run out
of time to test until much later tonight, so please feel free to sanity-test it.




[Bug 4461] mass-check --reuse cannot deal with previously-unscanned mail

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4461





------- Additional Comments From felicity@apache.org  2005-07-05 17:27 -------
Subject: Re:   New: mass-check --reuse cannot deal with previously-unscanned mail

On Tue, Jul 05, 2005 at 05:20:16PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
>   - what if the user has not installed Mail::SPF::Query, or a similar required
>     module, or was running with -L, and therefore would have X-Spam-Status
>     lines but never any hits on those rules?

or if the corpus has status lines from old versions (2.x for instance), etc.

These issues are why I stopped implementing reuse last year fwiw.
The way I was going to solve it was to have a plugin which recorded
the network queries and results in the header, then on mass-check,
when network queries start coming in, it would just use the header as
a cache for the results.  That was the start of the NetCache plugin,
which never got finished.
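
The header-as-cache idea can be sketched as below.  This is a Python
illustration only; it is not the unfinished NetCache plugin.  The header name,
the JSON encoding, and the function names are all made up for the example.

```python
import json
from email import message_from_string

# Hypothetical header name; the real plugin's format is not given in this thread.
CACHE_HEADER = "X-Hypothetical-NetCache"

def record_results(raw_message, results):
    """At scan time: store network query results (a dict) in a header."""
    msg = message_from_string(raw_message)
    del msg[CACHE_HEADER]  # drop any stale copy; a no-op if the header is absent
    msg[CACHE_HEADER] = json.dumps(results, sort_keys=True)
    return msg.as_string()

def lookup(raw_message, query, live_lookup):
    """At mass-check time: answer from the header if possible, otherwise
    fall back to the real network query passed in as `live_lookup`."""
    msg = message_from_string(raw_message)
    cached = json.loads(msg.get(CACHE_HEADER, "{}"))
    if query in cached:
        return cached[query]
    return live_lookup(query)
```

With this shape, a message that was never scanned simply has no cache header,
so every query falls through to the live lookup rather than being counted as
a miss.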






[Bug 4461] [review] mass-check --reuse cannot deal with previously-unscanned mail

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4461


parkerm@pobox.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|mass-check --reuse cannot   |[review] mass-check --reuse
                   |deal with previously-       |cannot deal with previously-
                   |unscanned mail              |unscanned mail
   Target Milestone|Undefined                   |3.1.0




------- Additional Comments From parkerm@pobox.com  2005-07-07 20:42 -------
OK, tested, but I encourage folks to give it some scrutiny.

Here is basically how it works:

1) Create $spamtest with the default rules.

2) If we are running with --reuse, make a copy of the default config, write out
the zeroed scores to mass_prefs, read that config file in, and then make a copy
of the resulting reuse config.

3) In sub wanted, when we read in the msg and we are running with --reuse: if
X-Spam-Status exists, it is used and we make sure the reuse config is loaded.
If X-Spam-Status does not exist, we make sure we are running under the default
config.  Of course, if we are running without --reuse, there is no switching
around.

4) Also, I added a reuse=yes or a reuse=no to the logfile line to indicate if
rules were reused for that particular msg.
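
The per-message switching in steps 1-4 can be sketched roughly as follows.
This is a Python illustration only, not the Perl in masses/mass-check.  The
names (DEFAULT_CONFIG, REUSE_CONFIG, choose_config) and the rule names and
scores are hypothetical.

```python
from email import message_from_string

# Hypothetical stand-in for the default rules (step 1); scores are made up.
DEFAULT_CONFIG = {"name": "default",
                  "scores": {"SPF_FAIL": 1.0, "RCVD_IN_XBL": 3.0}}

# Hypothetical reuse config (step 2): the same rules with zeroed scores.
REUSE_CONFIG = {"name": "reuse",
                "scores": {rule: 0.0 for rule in DEFAULT_CONFIG["scores"]}}

def choose_config(raw_message, reuse_enabled):
    """Pick the config for one message (step 3) and return it together with
    the reuse=yes / reuse=no marker for the log line (step 4)."""
    msg = message_from_string(raw_message)
    has_status = msg.get("X-Spam-Status") is not None
    if reuse_enabled and has_status:
        return REUSE_CONFIG, "reuse=yes"
    # No X-Spam-Status header, or --reuse not given: use the default config.
    return DEFAULT_CONFIG, "reuse=no"
```

The point of choosing per message is that a previously-unscanned mail gets a
full scan under the default config instead of being logged as a miss.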

Please review for inclusion in 3.1.0-pre4.




[Bug 4461] [review] mass-check --reuse cannot deal with previously-unscanned mail

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4461


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




------- Additional Comments From jm@jmason.org  2005-07-11 18:07 -------
applied! (sorry Michael ;)

Sending        masses/mass-check
Transmitting file data .
Committed revision 215904.



