Posted to dev@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2004/07/16 20:59:26 UTC

Re: a few things

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Daniel Quinlan writes:
> 1. Shouldn't we have Return-Path: higher up on the list of envelope
>    sender headers that are checked in get_envelope_from() ?

Hmm.   Well, the idea is:

    1. if the unusual headers (X-Envelope-From, Envelope-Sender, X-Sender)
    are present and trustworthy, use them
    2. fall back to the RFC-2822 std, Return-Path, which is pretty much
    always present
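
A minimal sketch of the two-step fallback described above, written in Python for illustration. It is not the actual get_envelope_from() code: the function name, the `unusual_trusted` flag, and the dictionary representation of the headers are all assumptions made for this example.

```python
from typing import Mapping, Optional

# Headers named in the message above; checked before Return-Path.
UNUSUAL_HEADERS = ("X-Envelope-From", "Envelope-Sender", "X-Sender")


def envelope_from(headers: Mapping[str, str], unusual_trusted: bool) -> Optional[str]:
    """Return an envelope sender, or None if no usable header is present.

    `unusual_trusted` is a hypothetical flag standing in for whatever
    trust decision the real code makes about the unusual headers.
    """
    if unusual_trusted:
        for name in UNUSUAL_HEADERS:
            value = headers.get(name, "").strip().strip("<>")
            if value:
                return value
    # Step 2: fall back to Return-Path, which is nearly always present.
    value = headers.get("Return-Path", "").strip().strip("<>")
    return value or None
```

With both X-Sender and Return-Path present, the result depends only on whether the unusual headers are considered trustworthy.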

> I think the
>    heuristic could probably use some work, perhaps look at the top
>    Received: line to determine a priority order for headers.

how would this work?

> 2. ROUND_THE_WORLD - is this still a tflags net test?

Yes, it may need to perform rDNS lookups.

If the S/O is bad, this would be a good candidate to drop, as the
spammer behaviour is no longer prevalent.

> 3. NO_DNS_FOR_FROM is broken ... again.  I've known this has been
>    misdesigned (using the foreground mx() function in Net::DNS) for a
>    while, but the right solution eluded me until now.
> 
>    0.000   0.0000   0.0000    0.500   0.47    1.10  NO_DNS_FOR_FROM
>    0.000   0.0000   0.0000    0.500   0.45    1.10  NO_DNS_FOR_FROM:bzoetekouw
>    0.000   0.0000   0.0000    0.500   0.48    1.10  NO_DNS_FOR_FROM:jm
>    0.000   0.0000   0.0000    0.500   0.47    1.10  NO_DNS_FOR_FROM:parkerm
>    0.000   0.0000   0.0000    0.500   0.46    1.10  NO_DNS_FOR_FROM:quinlan
>    0.000   0.0000   0.0000    0.500   0.49    1.10  NO_DNS_FOR_FROM:rODbegbie
> 
>   I have a patch to fix NO_DNS_FOR_FROM.  1000 ham and spam randomly
>   sampled from my corpus after flushing my bind DNS cache:
> 
>    4.002   8.0000   0.0000    1.000   0.94    0.00 NO_DNS_FOR_FROM
> 
>   In addition to fixing the test and producing good results, the patch:
> 
>     - changes the MX test to use background sockets

oh good.  I think there's other MX tests elsewhere in the code though...

>     - skip_rbl_checks becomes skip_dns_checks, rbl_timeout becomes
>       dns_timeout

+0.5.  Only if the old names remain as synonyms.

We can now support synonyms very easily and efficiently in the Conf code,
and I think we've broken quite enough backwards compatibility in this
release.
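
As an illustration of keeping the old names as synonyms, here is a small Python sketch. It is not the SpamAssassin Conf code; the `SYNONYMS` table and `parse_config` function are hypothetical, and only the four option names come from this thread.

```python
from typing import Dict, Iterable

# Hypothetical alias table: old option name -> new option name.
SYNONYMS = {
    "skip_rbl_checks": "skip_dns_checks",
    "rbl_timeout": "dns_timeout",
}


def parse_config(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'key value' lines, storing old names under their new names."""
    settings: Dict[str, str] = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[SYNONYMS.get(key, key)] = value
    return settings
```

A single dictionary lookup per option keeps existing configuration files working without duplicating any handling code.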

>     - check_mx_attempts and check_mx_delay are gone

+1.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFA+CWOQTcbUG5Y7woRAn+aAKDbOeNEFrqN1bPJIMJLgDubFYJQAwCg1TnU
4h1nIvcZASRB4LZJ6tWUeQc=
=DF1t
-----END PGP SIGNATURE-----


Re: a few things

Posted by Daniel Quinlan <qu...@pathname.com>.
jm@jmason.org (Justin Mason) writes:

> Hmm.   Well, the idea is:
> 
>     1. if the unusual headers (X-Envelope-From, Envelope-Sender, X-Sender)
>     are present and trustworthy, use them
>     2. fall back to the RFC-2822 std, Return-Path, which is pretty much
>     always present

Okay, I looked a bit more closely.  I missed some of that logic, thanks.

How does this strike you?

 - if the top Received: header is fetchmail, ignore/strip it and any
   headers above it, I don't think we have to punt for these
 - if any of Return-path:, Sender:, X-Sender:, X-Envelope-From:,
   X-Env-Sender:, or 'From ' are above the first Received: line, use
   that.
 - if the first Received: line contains (envelope-from) use that
 - if a trusted result is needed, punt
 - if an untrusted result is acceptable (positive scoring rules), then
   take the first line that matches one of the above

So, store two results: one trusted (maybe undef) and one untrusted (less
likely to be undef).
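
The proposal above could be sketched roughly as follows, in Python for illustration. This is an interpretation, not SpamAssassin code: the header list representation, the substring test for fetchmail, the `(envelope-from ...)` regular expression, and all function names are assumptions. The mbox 'From ' line is left out for brevity, and a message with no Received: line is treated as having no trusted result.

```python
import re
from typing import List, Optional, Tuple

Header = Tuple[str, str]  # (name, value), in top-to-bottom order

SENDER_HEADERS = {"return-path", "sender", "x-sender",
                  "x-envelope-from", "x-env-sender"}
# Assumed shape of the comment some MTAs add to Received: lines.
ENVELOPE_FROM_RE = re.compile(r"\(envelope-from\s+<?([^<>\s)]+)>?\)", re.I)


def _candidate(name: str, value: str) -> Optional[str]:
    """Extract a sender address from one header, or None."""
    lname = name.lower()
    if lname in SENDER_HEADERS:
        return value.strip().strip("<>") or None
    if lname == "received":
        match = ENVELOPE_FROM_RE.search(value)
        return match.group(1) if match else None
    return None


def _received_indexes(headers: List[Header]) -> List[int]:
    return [i for i, (n, _) in enumerate(headers) if n.lower() == "received"]


def envelope_senders(headers: List[Header]) -> Tuple[Optional[str], Optional[str]]:
    """Return (trusted, untrusted); either may be None."""
    received = _received_indexes(headers)
    # Strip a top fetchmail Received: line and everything above it.
    if received and "fetchmail" in headers[received[0]][1].lower():
        headers = headers[received[0] + 1:]
        received = _received_indexes(headers)

    trusted = None
    above = headers[:received[0]] if received else []
    for name, value in above:  # sender headers above the first Received:
        if name.lower() in SENDER_HEADERS:
            trusted = _candidate(name, value)
            if trusted:
                break
    if trusted is None and received:  # (envelope-from) in first Received:
        trusted = _candidate(*headers[received[0]])

    untrusted = trusted
    if untrusted is None:  # otherwise take the first matching line anywhere
        for name, value in headers:
            untrusted = _candidate(name, value)
            if untrusted:
                break
    return trusted, untrusted
```

Rules that need a trusted value would punt when the first element is None, while positively scoring rules could use the second.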
 
>> I think the heuristic could probably use some work, perhaps look at
>> the top Received: line to determine a priority order for headers.

> how would this work?

Well, I like your logic better, so I modified my thinking.  However, the
idea was to look at the first Received: line and, judging by which MTA
added it, decide which header to grab.  There may be cases where an MTA
doesn't put its envelope-sender header above the Received: line it adds.

Maybe that would still make sense.

> > 2. ROUND_THE_WORLD - is this still a tflags net test?
> 
> Yes, it may need to perform rDNS lookups.
> 
> If the S/O is bad, this would be a good candidate to drop, as the
> spammer behaviour is no longer prevalent.

It's pretty good, but low hit rate:

  0.060   0.1069   0.0000    1.000   0.55    0.17  ROUND_THE_WORLD
  0.072   0.1144   0.0000    1.000   0.48    0.17  ROUND_THE_WORLD:bzoetekouw
  0.088   0.1750   0.0000    1.000   0.50    0.17  ROUND_THE_WORLD:jm
  0.049   0.1063   0.0000    1.000   0.48    0.17  ROUND_THE_WORLD:parkerm
  0.004   0.0084   0.0000    1.000   0.47    0.17  ROUND_THE_WORLD:quinlan
  0.237   0.2959   0.0000    1.000   0.57    0.17  ROUND_THE_WORLD:rODbegbie

(wow, it barely even works for me)
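
For readers unfamiliar with these tables, the following Python sketch shows how the S/O figure relates to the other columns. It assumes the columns are overall %, spam %, ham %, S/O, rank, score, and rule name, and that S/O is the spam hit rate divided by the sum of the spam and ham hit rates. The 0.5 value for a rule that never hits is inferred from the broken NO_DNS_FOR_FROM rows quoted earlier in the thread; the function name is hypothetical.

```python
def s_o_ratio(spam_pct: float, ham_pct: float) -> float:
    """Fraction of a rule's hits that land on spam (assumed definition)."""
    total = spam_pct + ham_pct
    if total == 0:
        # Assumed neutral value when the rule hits nothing at all.
        return 0.5
    return spam_pct / total
```

Under these assumptions, a rule that hits 0.1069% of spam and no ham has an S/O of 1.0, which agrees with the ROUND_THE_WORLD row above.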
 
> oh good.  I think there's other MX tests elsewhere in the code though...

Yeah, I still have to rationalize that stuff a bit.
 
> +0.5.  Only if the old names remain as synonyms.
> 
> We can now support synonyms very easily and efficiently in the Conf
> code, and I think we've broken quite enough backwards compatibility in
> this release.

Nah.  ;-)

(If you insist...)

>>     - check_mx_attempts and check_mx_delay are gone
> 
> +1.

Is there a way to handle ignored/deprecated options in Conf/Parser.pm?
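
One possible shape for this, sketched in Python rather than as real Conf/Parser.pm code: accept the removed options, ignore their values, and collect a warning. The `DEPRECATED` set, the `parse_with_deprecations` function, and the warning text are hypothetical; only the two option names come from this thread.

```python
from typing import Dict, Iterable, List, Tuple

# Options that no longer do anything but should not cause a parse error.
DEPRECATED = {"check_mx_attempts", "check_mx_delay"}


def parse_with_deprecations(lines: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Return (settings, warnings), skipping deprecated options."""
    settings: Dict[str, str] = {}
    warnings: List[str] = []
    for raw in lines:
        parts = raw.strip().split(None, 1)
        if not parts or parts[0].startswith("#"):
            continue  # blank line or comment
        key = parts[0]
        if key in DEPRECATED:
            warnings.append(f"option '{key}' is deprecated and ignored")
            continue
        settings[key] = parts[1].strip() if len(parts) > 1 else ""
    return settings, warnings
```

Returning the warnings instead of printing them leaves it to the caller to decide whether they appear only in a lint mode or on every run.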

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/