Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2009/12/09 06:08:39 UTC

[Bug 6251] New: Temporarily Reduce DNSWL scores for 3.3.0 release

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

           Summary: Temporarily Reduce DNSWL scores for 3.3.0 release
           Product: Spamassassin
           Version: 3.3.0
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P5
         Component: Rules
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: wtogami@redhat.com


Similar to Bug #6247 for Return Path, I believe we should reduce the scores of
DNSWL before the 3.3.0 release.

http://ruleqa.spamassassin.org/20091205-r887515-n
Weekly mass-checks have for many weeks shown a minor but consistent number of
false positives in DNSWL medium and low.  As a matter of practice I
generally do not report whitelist violations or blacklist FPs because doing so
would artificially bias the ruleqa statistics without solving underlying
problems in listing and removal policy.

Current 50_scores.cf:
score RCVD_IN_DNSWL_LOW 0 -1 0 -1
score RCVD_IN_DNSWL_MED 0 -4 0 -4
score RCVD_IN_DNSWL_HI 0 -8 0 -8

My concerns:
* DNSWL has no obvious and easy methods on their site to report violations
where spam was sent from a DNSWL listed host.
* Does DNSWL currently use spam traps as an automated method to detect
whitelist violations?  I don't know.  Whatever their regular practices are, I
strongly believe that automation is the only sustainable way for a service of
this nature to be maintainable in the long-term.
* GA rescoring when DNSWL was allowed to float assigned much lower scores than
our fixed scores.

score RCVD_IN_DNSWL_LOW 0 -0.7 0 -0.7
score RCVD_IN_DNSWL_MED 0 -2.3 0 -2.3
score RCVD_IN_DNSWL_HI 0 -5 0 -5

I believe we should ship 3.3.0 with these as reasonable scores.  After 3.3.0
release, if DNSWL clarifies the above concerns and the DNSWL statistics in
ruleqa improve due to methodology (not one-off corpus cleaning) then I believe
it would be fully appropriate to increase the scores in sa-update.
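
For readers unfamiliar with the four-column score lines above: each column is
one SpamAssassin score set (0 = no Bayes, no network tests; 1 = no Bayes,
network tests; 2 = Bayes, no network tests; 3 = Bayes and network tests).  The
following is only a rough sketch, not part of the proposal, showing how the
current and proposed values could be compared.  The helper names parse_scores
and network_set_delta are made up for illustration.

```python
# Sketch only: compare the current and proposed DNSWL score lines.
# Columns are score sets 0..3; sets 1 and 3 are the ones with network tests on.

CURRENT = """\
score RCVD_IN_DNSWL_LOW 0 -1 0 -1
score RCVD_IN_DNSWL_MED 0 -4 0 -4
score RCVD_IN_DNSWL_HI 0 -8 0 -8
"""

PROPOSED = """\
score RCVD_IN_DNSWL_LOW 0 -0.7 0 -0.7
score RCVD_IN_DNSWL_MED 0 -2.3 0 -2.3
score RCVD_IN_DNSWL_HI 0 -5 0 -5
"""


def parse_scores(text):
    """Return {rule_name: [set0, set1, set2, set3]} for four-column score lines."""
    scores = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if len(fields) == 6 and fields[0] == "score":
            scores[fields[1]] = [float(v) for v in fields[2:]]
    return scores


def network_set_delta(old, new):
    """Per-rule change in the score-set-1 column (network tests on, Bayes off)."""
    return {rule: round(new[rule][1] - old[rule][1], 3)
            for rule in old if rule in new}


old_scores = parse_scores(CURRENT)
new_scores = parse_scores(PROPOSED)
for rule, delta in network_set_delta(old_scores, new_scores).items():
    print(f"{rule}: {old_scores[rule][1]} -> {new_scores[rule][1]} ({delta:+})")
```

The zeros in columns 0 and 2 reflect that these are network rules, which do not
contribute when network tests are off.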

Comments?
Votes?

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

Kevin A. McGrail <km...@pccc.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kmcgrail@pccc.com

--- Comment #6 from Kevin A. McGrail <km...@pccc.com> 2009-12-09 07:05:07 UTC ---
> dnswl.org is mostly based on manual processes (there are regular crosschecks
> with blacklists, extensive use of DNS logs etc and additional tools).
> 
> (Full Disclosure: I'm the dnswl.org project leader)

I don't have a problem with manual processes for the record.  I also don't
necessarily believe our rules have to be based on any criteria such as email
response, etc.

This is because I agree completely with Justin that SA's entire framework is
designed to deal with FPs/FNs by weighted scoring.

Lowering the scores and running tests would be the appropriate action from my
perspective.  Additionally, the scores suggested seem good.

Therefore, I am +1 on changing the DNSWL scores to:

score RCVD_IN_DNSWL_LOW 0 -0.7 0 -0.7
score RCVD_IN_DNSWL_MED 0 -2.3 0 -2.3
score RCVD_IN_DNSWL_HI 0 -5 0 -5

KAM


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

Justin Mason <jm...@jmason.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jm@jmason.org

--- Comment #2 from Justin Mason <jm...@jmason.org> 2009-12-09 03:26:09 UTC ---
(In reply to comment #1)
> With the old scores it is easier to see whether an IP is really ham or spam,
> and from what I have seen on the users@sa mailing list, some people don't like,
> or don't understand what to do with, the listings that score their spam as
> ham

From our POV, we want SA to compensate for individual rule FP/FNs by allowing
other, more trustworthy rules to "take over", so reducing the scores is the
appropriate thing for us.  Whether or not an IP is misclassified by DNSWL can
be determined by the sysadmin using grep, or by us using ruleqa, etc.  It
shouldn't have to cause problems for the user.
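
As a rough illustration of the "sysadmin using grep" idea, the sketch below
looks for messages that SA still judged to be spam even though a DNSWL rule
hit, which are candidates for a whitelist violation.  It assumes the default
X-Spam-Status header layout, already unfolded onto one line; the function name
dnswl_hits_on_spam is hypothetical, and a real site may log rule hits somewhere
else entirely.

```python
import re

# Assumed input: unfolded header lines of the form
#   X-Spam-Status: Yes, score=7.1 required=5.0 tests=RULE_A,RULE_B autolearn=no
STATUS_RE = re.compile(r"^X-Spam-Status:\s*(Yes|No),.*?\btests=(\S*)",
                       re.IGNORECASE)


def dnswl_hits_on_spam(header_lines):
    """Return, per spam-tagged message, the list of DNSWL rules that hit it."""
    results = []
    for line in header_lines:
        match = STATUS_RE.match(line)
        if not match or match.group(1).lower() != "yes":
            continue  # not a status header, or not tagged as spam
        rules = match.group(2).split(",")
        dnswl = [r for r in rules if r.startswith("RCVD_IN_DNSWL_")]
        if dnswl:
            results.append(dnswl)
    return results


sample = [
    "X-Spam-Status: Yes, score=7.1 required=5.0 "
    "tests=BAYES_99,RCVD_IN_DNSWL_MED autolearn=no",
    "X-Spam-Status: No, score=-3.9 required=5.0 "
    "tests=RCVD_IN_DNSWL_MED autolearn=ham",
]
print(dnswl_hits_on_spam(sample))
```

This only catches DNSWL-listed spam that other rules managed to outweigh; spam
that slipped through as ham would still need a hand-classified corpus to find.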


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

AXB <al...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alex.uribl@gmail.com

--- Comment #7 from AXB <al...@gmail.com> 2009-12-09 07:09:04 UTC ---
I am +1 on changing the DNSWL scores to:

score RCVD_IN_DNSWL_LOW 0 -0.7 0 -0.7
score RCVD_IN_DNSWL_MED 0 -2.3 0 -2.3
score RCVD_IN_DNSWL_HI 0 -5 0 -5


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

Matthias Leisi <ma...@leisi.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matthias@leisi.net

--- Comment #4 from Matthias Leisi <ma...@leisi.net> 2009-12-09 04:58:09 UTC ---

> http://ruleqa.spamassassin.org/20091205-r887515-n
> Weekly mass-checks have for many weeks shown a minor but consistent number
> of false positives in DNSWL medium and low.  As a matter of practice I

Are these actual false positives or due to trusted_networks etc. errors?  

> My concerns:
> * DNSWL has no obvious and easy methods on their site to report violations
> where spam was sent from a DNSWL listed host.

There is a feedback form and an email address. 

> * Does DNSWL currently use spam traps as an automated method to detect
> whitelist violations?  I don't know.  Whatever their regular practices are, I
> strongly believe that automation is the only sustainable way for a service of
> this nature to be maintainable in the long-term.

dnswl.org is mostly based on manual processes (there are regular crosschecks
with blacklists, extensive use of DNS logs etc and additional tools).

(Full Disclosure: I'm the dnswl.org project leader)


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

Warren Togami <wt...@redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #14 from Warren Togami <wt...@redhat.com> 2009-12-15 13:42:43 UTC ---
This is a comment to satisfy Bugzilla's demand.


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

--- Comment #3 from Justin Mason <jm...@jmason.org> 2009-12-09 03:27:20 UTC ---
Also, the same thing applies as in bug 6247:

By the way, I don't know if we can safely do this in the 3.3.0 release.  We
need to ensure the changes to these rules don't move FP%/FN% rates far from
the levels they were measured at during rescoring.  (This can happen if the new
rules hit different mails, or the scores don't sufficiently compensate for
FP/FNs measured in the rescore mass-check.)

So even if the rules overlap sufficiently close to 100% that we can just drop
'em in as replacements, a final, pre-release step for this bug will be to
measure the _new_ scores using the rescore mass-check's logfiles (with
s/HABEAS_ACCREDITED_COI/RCVD_IN_RP_CERTIFIED/g etc.), determine FP/FN%, and
make sure it matches/improves on the old rates (or is at least still
acceptable).
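
The re-measurement step described above could be sketched roughly as follows.
This is an illustration only: it takes already-parsed (is_spam, rule hits)
pairs rather than real mass-check logfiles, the function name rescore is made
up, and the rule SPAMMY_RULE and all score values in the toy data are
hypothetical.

```python
def rescore(messages, scores, renames=None, threshold=5.0):
    """Recompute FP% and FN% for logged rule hits under a new score table.

    messages  -- iterable of (is_spam, [rule names]) pairs
    scores    -- {rule name: score}; unknown rules count as 0
    renames   -- optional {old rule name: new rule name}, the equivalent of
                 the s/OLD/NEW/g substitution on the logfiles
    """
    renames = renames or {}
    ham = spam = fp = fn = 0
    for is_spam, rules in messages:
        total = sum(scores.get(renames.get(rule, rule), 0.0) for rule in rules)
        flagged = total >= threshold
        if is_spam:
            spam += 1
            fn += not flagged   # spam that was not flagged
        else:
            ham += 1
            fp += flagged       # ham that was flagged
    fp_pct = 100.0 * fp / ham if ham else 0.0
    fn_pct = 100.0 * fn / spam if spam else 0.0
    return fp_pct, fn_pct


# Toy data with hypothetical scores, just to show the call shape.
toy_log = [
    (True, ["SPAMMY_RULE"]),
    (True, ["SPAMMY_RULE", "HABEAS_ACCREDITED_COI"]),
    (False, []),
    (False, ["SPAMMY_RULE"]),
]
toy_scores = {"SPAMMY_RULE": 6.0, "RCVD_IN_RP_CERTIFIED": -2.0}
renames = {"HABEAS_ACCREDITED_COI": "RCVD_IN_RP_CERTIFIED"}
print(rescore(toy_log, toy_scores, renames))
```

Running the same logged hits through the old and the new score tables and
comparing the two (FP%, FN%) pairs is the comparison the comment asks for.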


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

--- Comment #11 from Kevin A. McGrail <km...@pccc.com> 2009-12-09 07:46:11 UTC ---
> But DNSWL in this case is a collaborative community project.  This is only a
> suggestion to improve their methodologies in such a way that would make us feel
> more comfortable to assign a greater weight to their rules.

My $0.02: even if they deliver roses and give dark chocolates or send burning
dog poo to someone who sends them an email, the scoring is going to come down
to the results.  Warm and fuzzies are more just for the initial "guess" for
throwing the rule into the mix.

KAM


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

--- Comment #9 from Kevin A. McGrail <km...@pccc.com> 2009-12-09 07:20:15 UTC ---
(In reply to comment #8)
> Let me reiterate that I want the DNSWL scores to be adjusted higher in
> sa-update sometime later after methodologies as noted are improved.

I would expect the scores of all rules to be continuously adjusted, removed,
etc. through sa-update.

However, I don't believe SA is officially requiring changes in methodologies
for RBLs at the level you are suggesting.  

In short, the results of the test should stand on their own.  This may seem a
bit overbroad and I'm open to suggestions but there are plenty of RBLs that
don't accept comments or respond.

I would, of course, question the ones that charge money for delisting.  That
should likely be an automatic killer for inclusion.


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

--- Comment #10 from Warren Togami <wt...@redhat.com> 2009-12-09 07:31:57 UTC ---
OK right, in general you are correct.

But DNSWL in this case is a collaborative community project.  This is only a
suggestion to improve their methodologies in such a way that would make us feel
more comfortable to assign a greater weight to their rules.

Matthias Leisi, I want to state my appreciation for your team's work. 
Please know that we are not singling out DNSWL.  We are similarly reducing
scores for the other whitelists in Bug #6247.


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

--- Comment #12 from Warren Togami <wt...@redhat.com> 2009-12-15 13:29:31 UTC ---
Current 50_scores.cf:
score RCVD_IN_DNSWL_LOW 0 -1 0 -1
score RCVD_IN_DNSWL_MED 0 -4 0 -4
score RCVD_IN_DNSWL_HI 0 -8 0 -8

# SUMMARY for threshold 5.0:
# Correctly non-spam: 703455  99.87%
# Correctly spam:     2551162  97.96%
# False positives:       911  0.13%
# False negatives:     53158  2.04%
# TCR(l=50): 26.384082  SpamRecall: 97.959%  SpamPrec: 99.964%

Replacement 50_scores.cf:
score RCVD_IN_DNSWL_LOW 0 -0.7 0 -0.7
score RCVD_IN_DNSWL_MED 0 -2.3 0 -2.3
score RCVD_IN_DNSWL_HI 0 -5 0 -5

# SUMMARY for threshold 5.0:
# Correctly non-spam: 703424  99.87%
# Correctly spam:     2551604  97.98%
# False positives:       942  0.13%
# False negatives:     52716  2.02%
# TCR(l=50): 26.091208  SpamRecall: 97.976%  SpamPrec: 99.963%

This makes for a negligible difference in the mcsnapshot results and seems
clearly safe in my book.  I am committing and closing this bug.
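
For anyone wanting to check the two summaries by hand, the figures are
consistent with the usual definitions: TCR = Nspam / (lambda * FP + FN) with
lambda = 50, recall = correctly-spam / Nspam, and precision = correctly-spam /
(correctly-spam + FP).  The sketch below recomputes them from the four counts;
the function name summary is made up, and this is not the code that produced
the output above.

```python
def summary(ham_ok, spam_ok, fp, fn, lam=50):
    """Recompute the SUMMARY figures from the four raw counts."""
    n_ham = ham_ok + fp      # all ham messages in the corpus
    n_spam = spam_ok + fn    # all spam messages in the corpus
    return {
        "fp_pct": 100.0 * fp / n_ham,
        "fn_pct": 100.0 * fn / n_spam,
        "tcr": n_spam / (lam * fp + fn),
        "recall_pct": 100.0 * spam_ok / n_spam,
        "precision_pct": 100.0 * spam_ok / (spam_ok + fp),
    }


current = summary(ham_ok=703455, spam_ok=2551162, fp=911, fn=53158)
replacement = summary(ham_ok=703424, spam_ok=2551604, fp=942, fn=52716)

for name, result in (("current", current), ("replacement", replacement)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

With lambda = 50 each false positive costs as much as 50 false negatives, so
the 31 extra FPs outweigh the 442 fewer FNs and TCR dips slightly, from about
26.38 to about 26.09.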


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

--- Comment #13 from Warren Togami <wt...@redhat.com> 2009-12-15 13:40:11 UTC ---
Sending        rules/50_scores.cf
Transmitting file data .
Committed revision 891001.


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

J.D. Falk <jd...@returnpath.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jdfalk@returnpath.net


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

--- Comment #5 from Warren Togami <wt...@redhat.com> 2009-12-09 07:00:22 UTC ---
(In reply to comment #4)
> > My concerns:
> > * DNSWL has no obvious and easy methods on their site to report violations
> > where spam was sent from a DNSWL listed host.
> 
> There is a feedback form and an email address. 

http://www.dnswl.org/request.pl
Is this the feedback form?  It really needs to be more obvious how you are
expected to use this to report violations.

> 
> > * Does DNSWL currently use spam traps as an automated method to detect
> > whitelist violations?  I don't know.  Whatever their regular practices are, I
> > strongly believe that automation is the only sustainable way for a service of
> > this nature to be maintainable in the long-term.
> 
> dnswl.org is mostly based on manual processes (there are regular crosschecks
> with blacklists, extensive use of DNS logs etc and additional tools).

Are the cross-checks automated?
Are people notified if their listing gets demoted?


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

Benny Pedersen <me...@junc.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |me@junc.org

--- Comment #1 from Benny Pedersen <me...@junc.org> 2009-12-09 02:07:36 UTC ---
Won't these changed scores just hide the problem more?

With the old scores it is easier to see whether an IP is really ham or spam,
and from what I have seen on the users@sa mailing list, some people don't like,
or don't understand what to do with, the listings that score their spam as
ham.

IMHO DNSWL should really just relist the IP at another level if the IP is
spamming, rather than us changing the scores, which as I see it makes things
worse.

The problem, as I see it, is that users just complain, take the fastest
possible solution, and forget what really should be done :(


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

Warren Togami <wt...@redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P5                          |P1
   Target Milestone|Undefined                   |3.3.0


[Bug 6251] Temporarily Reduce DNSWL scores for 3.3.0 release

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6251

--- Comment #8 from Warren Togami <wt...@redhat.com> 2009-12-09 07:10:40 UTC ---
Let me reiterate that I want the DNSWL scores to be adjusted higher in
sa-update sometime later after methodologies as noted are improved.
