Posted to users@spamassassin.apache.org by John Rudd <jr...@ucsc.edu> on 2006/10/08 11:13:19 UTC

First Received Header only

Is there a way to have SpamAssassin look at the first Received header only?

I want to check certain characteristics of the first received header 
(for the current relay), like whether or not it looks like a dynamic 
hostname, etc., and boost the score based on that.  Can I do that with 
regular rules, or do I need to do that with a plug-in, or what?  (or, 
has someone else already done that?)

Re: First Received Header only

Posted by "John D. Hardin" <jh...@impsec.org>.

On Sun, 8 Oct 2006, John Rudd wrote:

> Right, but if it came from a machine in my network, I don't want
> the dynamic IP addr checks to increase the score.  Thus, I need to
> look at both.  If there's anyone in the Trusted pseudo-header,
> then don't bother with the checks, basically.

Set up a local DNSBL with your network ranges and give it a small
positive score.

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Gun Control: The theory that a woman found dead in an alley, raped
  and strangled with her panty hose, is somehow morally superior to a
  woman explaining to police how her attacker got that fatal bullet
  wound. 
-----------------------------------------------------------------------

Re: First Received Header only

Posted by John Rudd <jr...@ucsc.edu>.

Clifton Royston wrote:
> On Sun, Oct 08, 2006 at 02:53:47PM -0700, John Rudd wrote:
>> Clifton Royston wrote:
>>>  Also, IIRC there's already a set of rules closely related to what
>>> you're asking, so you can base it off those.  Try looking for the
>>> definition of HELO_DYNAMIC_IPADDR and related rules.
>> Yeah, HELO_DYNAMIC_IPADDR is interesting.  It shows me how to start 
>> looking at what I want to look for.  I want to look at the rdns part, 
>> not the helo part (I don't care at all about the helo string) ... but I 
>> can write something from that for the rdns= part of the pseudo-header. 
>> Excellent information, thanks :-)
>>
>> I'm guessing I need to look at the first segment of 
>> X-Spam-Relays-Untrusted, but ONLY if there's nothing in 
>> X-Spam-Relays-Trusted ...  Guess I have to learn how to do meta rules, too.
> 
>   As I understand it, and given the examples on the TrustedRelay page,
> X-Spam-Relays-Trusted is all of *your* servers it came through on the
> way to the SA machine, so you want to ignore anything in there.  If you
> have no relays before this machine, there won't be anything in there.
> 
>

Right, but if it came from a machine in my network, I don't want the 
dynamic IP addr checks to increase the score.  Thus, I need to look at 
both.  If there's anyone in the Trusted pseudo-header, then don't bother 
with the checks, basically.
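
That condition can be sketched in Python, outside SpamAssassin, to make the logic explicit. This is only an illustration: the function name is hypothetical, and the sample strings are made-up stand-ins for the two pseudo-headers. It assumes each relay entry is wrapped in square brackets, as in the pseudo-header regexes elsewhere in this thread.

```python
def should_run_dynamic_checks(trusted_header, untrusted_header):
    """Run the dynamic-host checks only for mail that came straight from outside.

    Assumption: a pseudo-header with at least one relay entry contains "[".
    """
    has_trusted_relay = "[" in trusted_header
    has_untrusted_relay = "[" in untrusted_header
    return has_untrusted_relay and not has_trusted_relay


# Made-up sample values, not taken from a real message.
outside = "[ ip=192.0.2.45 rdns=dhcp-45.example.net helo=foo by=mx1.example.org auth= ]"
inside = "[ ip=10.0.0.7 rdns=desk7.example.org helo=desk7 by=mx1.example.org auth= ]"

print(should_run_dynamic_checks("", outside))      # mail direct from an outside host
print(should_run_dynamic_checks(inside, outside))  # mail that passed an internal relay
```

In SA itself this would presumably end up as a meta rule combining a "trusted is non-empty" subrule with the dynamic-host subrules.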

Re: First Received Header only

Posted by Clifton Royston <cl...@lava.net>.

On Sun, Oct 08, 2006 at 02:53:47PM -0700, John Rudd wrote:
> Clifton Royston wrote:
> >  Also, IIRC there's already a set of rules closely related to what
> >you're asking, so you can base it off those.  Try looking for the
> >definition of HELO_DYNAMIC_IPADDR and related rules.
> 
> Yeah, HELO_DYNAMIC_IPADDR is interesting.  It shows me how to start 
> looking at what I want to look for.  I want to look at the rdns part, 
> not the helo part (I don't care at all about the helo string) ... but I 
> can write something from that for the rdns= part of the pseudo-header. 
> Excellent information, thanks :-)
> 
> I'm guessing I need to look at the first segment of 
> X-Spam-Relays-Untrusted, but ONLY if there's nothing in 
> X-Spam-Relays-Trusted ...  Guess I have to learn how to do meta rules, too.

  As I understand it, and given the examples on the TrustedRelay page,
X-Spam-Relays-Trusted is all of *your* servers it came through on the
way to the SA machine, so you want to ignore anything in there.  If you
have no relays before this machine, there won't be anything in there.

  -- Clifton

-- 
    Clifton Royston  --  cliftonr@iandicomputing.com / cliftonr@lava.net
       President  - I and I Computing * http://www.iandicomputing.com/
 Custom programming, network design, systems and network consulting services

Re: First Received Header only

Posted by John Rudd <jr...@ucsc.edu>.

Loren Wilton wrote:
>> Here's an odd perl question: can you reference $1 and its siblings 
>> within the regex itself?  such as:
>>
>> /^\[ ip=(\d+)\.(\d+)\.(\d+)\.(\d+) rdns=\S*(0*($1|$2|$3|$4)\S){2,4}\S* 
>> [^\]]* auth= /
> 
> You can do it, but it slows down the whole regex system as soon as you 
> have a capturing regex.  Or so I'm told by the Perl regex docs.  Use 
> backslashes, not dollar signs.
> 
>> /^\[ ip=(\d+)\.(\d+)\.(\d+)\.(\d+) rdns=\S*(0*(\1|\2|\3|\4)\S){2,4}\S* 
>> [^\]]* auth= /
> 
> 
> As a side note, while I'm not completely sure what you are trying to 
> accomplish, it seems to me that if you just set your trust paths 
> correctly and enabled some of the net rules that 99% of this is already 
> caught with existing rules.  You might want to tweak some scores or add 
> some metas.  But rebuilding the entire logic to determine if the message 
> came from a forged host seems like a strange concept, when SA already 
> does that.
> 
>     

I'm not looking for a forged host.  I'm looking for someone else's 
dynamic/end client.  Only one of the 5 tests I gave looks for a forged 
PTR record.

The first one rejects people who have no PTR record.  The second rejects 
people who have misconfigured their DNS in a fundamental way.  The third 
is the forged PTR record.  The fourth and fifth ones are entirely about 
"is this a dynamic/end client for some other network" (meaning: someone 
who should be connecting to their own mail server instead of connecting 
directly to my mail server).  I'm not aware of any checks that duplicate 
the effort of #4 and #5.  If they already exist, I'd be glad to know.

You're right to some extent though.  I mainly care about the 4th and 5th 
check, and if SA already really handles the first 3, I don't need to 
duplicate that effort.  I do them all in one place now, in mimedefang, 
because I want to be sure that the hostname I look at for #4 and #5 is 
the actual legitimate hostname.

So... just for my curiosity, which tests specifically duplicate my 
first 3 checks (so I can adjust their scores to suit my needs)?  Or does 
SA just generally cover those topics?

Re: First Received Header only

Posted by Loren Wilton <lw...@earthlink.net>.

> Here's an odd perl question: can you reference $1 and its siblings within 
> the regex itself?  such as:
>
> /^\[ ip=(\d+)\.(\d+)\.(\d+)\.(\d+) rdns=\S*(0*($1|$2|$3|$4)\S){2,4}\S* 
> [^\]]* auth= /

You can do it, but it slows down the whole regex system as soon as you have 
a capturing regex.  Or so I'm told by the Perl regex docs.  Use backslashes, 
not dollar signs.

> /^\[ ip=(\d+)\.(\d+)\.(\d+)\.(\d+) rdns=\S*(0*(\1|\2|\3|\4)\S){2,4}\S* 
> [^\]]* auth= /
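
The backslash form can be tried outside SpamAssassin. The sketch below uses Python's re module, which spells in-pattern backreferences the same way (\1, \2, ...). The two sample relay strings are invented for illustration and only loosely follow the pseudo-header layout implied by the regex; the function name is hypothetical.

```python
import re

# Same pattern as above, with the wrapped line joined by a single space.
# \1..\4 refer back to the four octets captured from ip=.
RELAY_EMBEDS_IP = re.compile(
    r"^\[ ip=(\d+)\.(\d+)\.(\d+)\.(\d+) rdns=\S*(0*(\1|\2|\3|\4)\S){2,4}\S* [^\]]* auth= "
)


def relay_embeds_ip(pseudo_header):
    """True if the first relay's rdns contains 2-4 of its own IP octets."""
    return RELAY_EMBEDS_IP.search(pseudo_header) is not None


# Invented sample entries.
dynamic = "[ ip=10.20.30.40 rdns=host-10-20-30-40.dsl.example.net helo=foo by=mx.example.org auth= ]"
static = "[ ip=10.20.30.40 rdns=mail.example.net helo=foo by=mx.example.org auth= ]"

print(relay_embeds_ip(dynamic))
print(relay_embeds_ip(static))
```

Note that the trailing \S in the repeated group also accepts a digit as the "separator", so short octets such as 1 or 2 can match by accident inside unrelated numbers in a hostname.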


As a side note, while I'm not completely sure what you are trying to 
accomplish, it seems to me that if you just set your trust paths correctly 
and enabled some of the net rules that 99% of this is already caught with 
existing rules.  You might want to tweak some scores or add some metas.  But 
rebuilding the entire logic to determine if the message came from a forged 
host seems like a strange concept, when SA already does that.

        Loren

Re: First Received Header only

Posted by John Rudd <jr...@ucsc.edu>.

Clifton Royston wrote:
> On Sun, Oct 08, 2006 at 02:13:19AM -0700, John Rudd wrote:
>> Is there a way to have spam assassin look at the first received header only?
>>
>> I want to check certain characteristics of the first received header 
>> (for the current relay), like whether or not it looks like a dynamic 
>> hostname, etc., and boost the score based on that.  Can I do that with 
>> regular rules, or do I need to do that with a plug-in, or what?  (or, 
>> has someone else already done that?)
> 
>   This is closely related to the question I was asking a few days ago,
> and Justin Mason pointed me to the answer:
> 
>   <http://wiki.apache.org/spamassassin/TrustedRelays>
> 
>   To check the first received header, you need to make sure that you
> have TrustedNetworks correctly configured, and write your regex to only
> look at the first entry within the 'X-Spam-Relays-Untrusted'
> pseudoheader.
> 
>   Also, IIRC there's already a set of rules closely related to what
> you're asking, so you can base it off those.  Try looking for the
> definition of HELO_DYNAMIC_IPADDR and related rules.
> 

Yeah, HELO_DYNAMIC_IPADDR is interesting.  It shows me how to start 
looking at what I want to look for.  I want to look at the rdns part, 
not the helo part (I don't care at all about the helo string) ... but I 
can write something from that for the rdns= part of the pseudo-header. 
Excellent information, thanks :-)

I'm guessing I need to look at the first segment of 
X-Spam-Relays-Untrusted, but ONLY if there's nothing in 
X-Spam-Relays-Trusted ...  Guess I have to learn how to do meta rules, too.

Here's an odd perl question: can you reference $1 and its siblings 
within the regex itself?  such as:

/^\[ ip=(\d+)\.(\d+)\.(\d+)\.(\d+) rdns=\S*(0*($1|$2|$3|$4)\S){2,4}\S* 
[^\]]* auth= /

If that works, that handles looking for the IP addr in the host name in 
_decimal_, but not in hex.  If not, then I probably need to go with a 
plugin.  And I probably need to go with a plugin if I'm going to look 
for the hex IP address.
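
For the hex case, a backreference can't help, because the regex engine only repeats the captured text and cannot convert it to another base. A rough sketch of what plugin-style code would have to do instead, written in Python purely for illustration (a real SA plugin would be Perl): build the pattern per message from the octets. The function name is hypothetical, and matching hex only as two zero-padded digits is my assumption to limit accidental hits.

```python
import re


def rdns_embeds_ip(ip, rdns, min_segments=2):
    """True if rdns contains at least min_segments IP octets in a row,
    in decimal (leading zeros allowed) or two-digit hex, with at most
    one arbitrary character between neighbouring octets."""
    forms = []
    for part in ip.split("."):
        octet = int(part)
        forms.append("0*" + str(octet))   # decimal, optional leading zeros
        forms.append("%02x" % octet)      # hex, assumed zero-padded to 2 digits
    segment = "(?:" + "|".join(forms) + ")"
    pattern = segment + "(?:.?" + segment + "){" + str(min_segments - 1) + ",}"
    return re.search(pattern, rdns, re.IGNORECASE) is not None


print(rdns_embeds_ip("192.0.2.45", "host-192-0-2-45.example.net"))  # decimal octets
print(rdns_embeds_ip("192.0.2.45", "c000022d.dsl.example.net"))     # hex octets
print(rdns_embeds_ip("192.0.2.45", "mail.example.net"))             # neither
```

This is still a heuristic: it does not anchor on word boundaries, so it can fire on hostnames that merely happen to contain matching digits.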

Anyone know how to reference the pseudo-headers from within a plugin? :-}

Right now I do all of this within mimedefang ... but I think I might 
want to have it feed the SA score instead.  The things I check now are:

1) no rdns (can do with a pseudo-header rule)
2) rdns name is a CNAME not an A record (I'm sure that needs a plugin)
3) rdns name leads to an A record, A record doesn't lead to IP (plugin)
4) rdns name contains any of these strings: catv, cable, dsl, dhcp, 
ddns, dial-up, dialup, dynamic, ppp (easy with the pseudo-header rule)
5) rdns name contains 2 or more of the IP address segments (in decimal 
or hex format), optionally separated by any single character, with or 
without leading zeros. (see above)
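
Checks 1 and 4 are plain string tests on the rdns value, which is why they fit a pseudo-header rule. A small Python sketch of just those two, for illustration only; the function name and return labels are made up, and the keyword list is the one from item 4.

```python
import re

# Keyword list from check 4; "dial-?up" covers both dial-up and dialup.
DYNAMIC_WORDS = re.compile(
    r"catv|cable|dsl|dhcp|ddns|dial-?up|dynamic|ppp", re.IGNORECASE
)


def classify_rdns(rdns):
    """Return which of checks 1 and 4 (if any) the rdns name fails."""
    if not rdns:
        return "no_rdns"          # check 1: no PTR record at all
    if DYNAMIC_WORDS.search(rdns):
        return "dynamic_keyword"  # check 4: looks like an end-client name
    return "ok"


print(classify_rdns(""))
print(classify_rdns("ppp-12.pool.example.net"))
print(classify_rdns("mail.example.org"))
```

Because these are substring matches, a name that merely contains one of the strings inside a longer word will also trigger check 4.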

Right now, these all lead to SMTP rejections.  I'm thinking that, with 
SA, though, I'd set the scores as: 4 + (# of those checks that didn't 
pass) (so, each rule scores 1, and a meta rule adds 4 if any of them was 
triggered).  So a total score in the range 5-9.  Or maybe just a flat 6. 
  I haven't decided.

(and I have mimedefang set to SMTP reject anything with a score >= 10; 
so these rules themselves wouldn't trigger a rejection, but the more 
spammy the message itself is, the more likely the combination would; and 
as long as the rest of the SA score isn't negative, these rules alone 
would cause it to get put into my quarantine)
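
To make the arithmetic concrete, here is the "4 + number of failed checks" option as a few lines of Python. This only illustrates the scoring idea; the function names are hypothetical, and the threshold of 10 is the mimedefang reject level mentioned above.

```python
REJECT_THRESHOLD = 10.0  # mimedefang rejects at SMTP time at or above this


def dynamic_host_score(failed_checks, per_rule=1.0, meta_bonus=4.0):
    """Each failed check scores per_rule; any failure also adds meta_bonus."""
    if failed_checks == 0:
        return 0.0
    return meta_bonus + per_rule * failed_checks


def would_reject(other_sa_score, failed_checks):
    """True if the rest of the SA score plus these rules reaches the threshold."""
    return other_sa_score + dynamic_host_score(failed_checks) >= REJECT_THRESHOLD


for failed in range(6):
    print(failed, dynamic_host_score(failed))
```

With five checks the contribution runs from 5 (one failure) to 9 (all five), so it never reaches 10 by itself, and the rest of the message has to add at least 1 more point for a rejection.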

BTW: the purpose of this is to catch spambots on dynamic IP addresses, 
or in people's homes, etc.  So, I make allowances for my own network's 
dynamic hosts, and for people who authenticate ... everyone else should 
go through their ISP's mail server.  The reason I'm moving away from 
automatic rejection, though, is that while I absolutely consider this to 
be a requirement, some legitimate businesses are stubborn about not 
sending their email through their ISP even though their ISP won't give 
them a custom PTR record.  I'm fine with automatically rejecting those 
sites at home, but my users may not be comfortable with me automatically 
rejecting them at work.  But, at home, it has been _amazingly_ successful.

Re: First Received Header only

Posted by John Rudd <jr...@ucsc.edu>.

John D. Hardin wrote:
> On Sun, 8 Oct 2006, Clifton Royston wrote:
> 
>>   This is closely related to the question I was asking a few days
>> ago, and Justin Mason pointed me to the answer:
>>
>>   <http://wiki.apache.org/spamassassin/TrustedRelays>
> 
> I've now gotten five copies of this message. Is anybody else getting
> dupes too?
> 

Yep.

Re: First Received Header only

Posted by "John D. Hardin" <jh...@impsec.org>.

On Sun, 8 Oct 2006, Clifton Royston wrote:

>   This is closely related to the question I was asking a few days
> ago, and Justin Mason pointed me to the answer:
> 
>   <http://wiki.apache.org/spamassassin/TrustedRelays>

I've now gotten five copies of this message. Is anybody else getting
dupes too?

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  The difference is that Unix has had thirty years of technical
  types demanding basic functionality of it. And the Macintosh has
  had fifteen years of interface fascist users shaping its progress.
  Windows has the hairpin turns of the Microsoft marketing machine
  and that's all.                                    -- Red Drag Diva
-----------------------------------------------------------------------

Re: First Received Header only

Posted by Clifton Royston <cl...@lava.net>.

On Sun, Oct 08, 2006 at 02:13:19AM -0700, John Rudd wrote:
> 
> Is there a way to have spam assassin look at the first received header only?
> 
> I want to check certain characteristics of the first received header 
> (for the current relay), like whether or not it looks like a dynamic 
> hostname, etc., and boost the score based on that.  Can I do that with 
> regular rules, or do I need to do that with a plug-in, or what?  (or, 
> has someone else already done that?)

  This is closely related to the question I was asking a few days ago,
and Justin Mason pointed me to the answer:

  <http://wiki.apache.org/spamassassin/TrustedRelays>

  To check the first received header, you need to make sure that you
have TrustedNetworks correctly configured, and write your regex to only
look at the first entry within the 'X-Spam-Relays-Untrusted'
pseudoheader.
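
  To illustrate what "only the first entry" means, here is a small Python sketch (not SA code) that pulls the first bracketed entry out of a pseudo-header string and splits it into key=value fields.  The sample string is made up, and the field layout is my assumption based on the examples on the TrustedRelays page; the function name is hypothetical.

```python
import re


def first_relay(pseudo_header):
    """Return the key=value fields of the first [ ... ] entry, or None."""
    match = re.match(r"\s*\[ ([^\]]*)\]", pseudo_header)
    if match is None:
        return None
    fields = {}
    for token in match.group(1).split():
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


# Made-up sample with two relay entries; only the first one is examined.
sample = (
    "[ ip=192.0.2.45 rdns=dhcp-45.example.net helo=foo by=mx1.example.org auth= ] "
    "[ ip=198.51.100.7 rdns= helo=bar by=relay.example.net auth= ]"
)

relay = first_relay(sample)
print(relay["ip"], relay["rdns"])
```

  In a rule, the same effect comes from anchoring the regex at the start of the pseudo-header and never letting it run past the first closing bracket.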

  Also, IIRC there's already a set of rules closely related to what
you're asking, so you can base it off those.  Try looking for the
definition of HELO_DYNAMIC_IPADDR and related rules.

  -- Clifton


Re: First Received Header only

Posted by "John D. Hardin" <jh...@impsec.org>.

On Sun, 8 Oct 2006, John Rudd wrote:

> > I assume you have a clear idea of what you mean by "first"? The
> > Received header that *your* MTA is adding? The Received header that
> > the outermost MTA you trust is adding?
> 
> I mean the topmost header that my MTA is adding.

That's the simplest case. You know exactly what is going to be in the
Received: header that your MTA is adding, and you explicitly trust
that the information isn't forged.

> > How robust this is will depend on how complex your mail relay chain
> > is. If your trusted ISP has lots of exposed mail servers it may be
> > difficult to do it this way.
> 
> We don't trust any ISP.  We trust our internal hosts, and that's
> it (and, as soon as I can force the policy change, we won't even
> trust them;  it will be "do SMTP-AUTH or we treat you like a
> random host on the net").

That's not what "trust" means in this context. For SA, "trust" means
"we know headers from this server are not forged". For this purpose
you will *always* trust yourself, and you want to tell SA to trust all
the way out to the public-most known-reliable server that receives
mail for your domain, or many network tests won't work correctly.

For example, if your SA machine is "behind" a publicly-visible MTA,
but you don't trust it, then any DNSBL tests performed on the sender
address will always be testing the publicly-visible MTA (which is the
next hop past the last trusted host) rather than the actual client
that originated the message. This probably isn't going to generate any
useful information.
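
A simplified model of that effect, in Python rather than anything SA actually runs: walk the relay IPs from the newest hop outward and stop at the first one that is not in a trusted network. The function name, the addresses, and the reduction of the trust path to a plain network list are all assumptions made for the illustration.

```python
import ipaddress


def first_untrusted_ip(relay_ips, trusted_networks):
    """relay_ips is ordered newest hop first, like Received headers top-down.

    Returns the first address outside every trusted network, which is the
    address network tests such as DNSBL lookups would end up checking.
    """
    nets = [ipaddress.ip_network(net) for net in trusted_networks]
    for ip in relay_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in nets):
            return ip
    return None


# Hypothetical chain: the public MTA handed the mail to the SA box,
# and the real client handed it to the public MTA.
chain = ["192.0.2.10", "203.0.113.9"]

print(first_untrusted_ip(chain, ["127.0.0.0/8"]))                   # MTA not trusted
print(first_untrusted_ip(chain, ["127.0.0.0/8", "192.0.2.10/32"]))  # MTA trusted
```

In the first call the lookup target is the public MTA itself; only once the MTA is trusted does the actual client address get tested.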

> We have 3 scanning hosts that are our MX servers.  I suppose the 
> above might work, as I could have the "first.trusted.host" be 
> "smtp-prod-mx-?[1-4]\.ucsc\.edu".

Right.

> Though, some of what I want to do is going to be complex enough,
> that I may want to do it with a plugin anyway. (checking to see if
> the host's IP address segments are contained within the hostname)  
> Just looking for things like "dhcp", "dsl", "dynamic", etc. in the
> hostname could be done with the above, though.

From what you are describing, those servers are publicly visible feeds
into your mail system. You should tell SA that they are trusted so
that a lot of built-in tests that already do what you want will be
enabled and will work properly.

For example, you can easily add DNSBL tests against the various public
dynamic IP address block DNSBLs, rather than writing rules to check
the rDNS name for the IP address that is in a specific Received
header.
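
For reference, a DNSBL lookup is just a DNS query on the reversed IPv4 octets prepended to the list's zone; a listed address resolves (conventionally to something in 127.0.0.0/8) and an unlisted one does not. A minimal Python sketch of building the query name, with a made-up zone and a hypothetical function name:

```python
def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    reversed_octets = ".".join(reversed(ip.split(".")))
    return reversed_octets + "." + zone


# "dul.dnsbl.example" is a placeholder zone, not a real list.
print(dnsbl_query_name("203.0.113.9", "dul.dnsbl.example"))
```

SA does the lookup itself once the rule and the trust path are configured; the sketch only shows what is being asked of the DNS.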

> > 'course, now somebody will chime in with a SA facility for doing this
> > neatly that I'm not aware of, and make me look silly (again)... :)
> > 
> > Like: is there a pseudo-header available to rules that is the
> > outermost Received header in the trust path? If not, then it might be
> > a useful addition.
> 
> That was my hope (that such a thing existed)


Re: First Received Header only

Posted by John Rudd <jr...@ucsc.edu>.

John D. Hardin wrote:
> On Sun, 8 Oct 2006, John Rudd wrote:
> 
>> Is there a way to have spam assassin look at the first received
>> header only?
> 
> I assume you have a clear idea of what you mean by "first"? The
> Received header that *your* MTA is adding? The Received header that
> the outermost MTA you trust is adding?

I mean the topmost header that my MTA is adding.  Though, I can see how 
the 2nd one might have more generic meaning to a larger SA audience.

>> I want to check certain characteristics of the first received header 
>> (for the current relay), like whether or not it looks like a dynamic
>> hostname, etc., and boost the score based on that.  Can I do that with 
>> regular rules, or do I need to do that with a plug-in, or what?  (or, 
>> has someone else already done that?)
> 
> Some of that is already done automatically by SA. That is why you
> define the trust path. But if that's not sufficient:
> 
> There is probably some static information in the desired Received
> header that you can key off, e.g. the "by {hostname}" part:
> 
>  header FNORD Received =~ /some_test.*(?:\bby\sfirst\.trusted\.host)/i
> 
> ...where you'd vary the some_test part over multiple rules to check
> for different things in the header picked out by the constant
> first.trusted.host hostname part.
> 
> How robust this is will depend on how complex your mail relay chain
> is. If your trusted ISP has lots of exposed mail servers it may be
> difficult to do it this way.

We don't trust any ISP.  We trust our internal hosts, and that's it 
(and, as soon as I can force the policy change, we won't even trust them; 
it will be "do SMTP-AUTH or we treat you like a random host on the 
net").  We have 3 scanning hosts that are our MX servers.  I suppose the 
above might work, as I could have the "first.trusted.host" be 
"smtp-prod-mx-?[1-4]\.ucsc\.edu".

Though, some of what I want to do is going to be complex enough, that I 
may want to do it with a plugin anyway. (checking to see if the host's 
IP address segments are contained within the hostname)  Just looking for 
things like "dhcp", "dsl", "dynamic", etc. in the hostname could be done 
with the above, though.

> 'course, now somebody will chime in with a SA facility for doing this
> neatly that I'm not aware of, and make me look silly (again)... :)
> 
> Like: is there a pseudo-header available to rules that is the
> outermost Received header in the trust path? If not, then it might be
> a useful addition.

That was my hope (that such a thing existed)

Re: First Received Header only

Posted by "John D. Hardin" <jh...@impsec.org>.

On Sun, 8 Oct 2006, John Rudd wrote:

> Is there a way to have spam assassin look at the first received
> header only?

I assume you have a clear idea of what you mean by "first"? The
Received header that *your* MTA is adding? The Received header that
the outermost MTA you trust is adding?

> I want to check certain characteristics of the first received header 
> (for the current relay), like whether or not it looks like a dynamic
> hostname, etc., and boost the score based on that.  Can I do that with 
> regular rules, or do I need to do that with a plug-in, or what?  (or, 
> has someone else already done that?)

Some of that is already done automatically by SA. That is why you
define the trust path. But if that's not sufficient:

There is probably some static information in the desired Received
header that you can key off, e.g. the "by {hostname}" part:

 header FNORD Received =~ /some_test.*(?:\bby\sfirst\.trusted\.host)/i

...where you'd vary the some_test part over multiple rules to check
for different things in the header picked out by the constant
first.trusted.host hostname part.
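
To see how the two halves of that regex interact, here is the same shape exercised in Python against two invented Received lines. The helper name, the hostnames, and the choice of "dhcp" as the some_test part are all placeholders for illustration.

```python
import re


def received_rule(some_test, by_host):
    """Compile some_test.*(?:\\bby\\s<by_host>) case-insensitively."""
    return re.compile(
        some_test + r".*(?:\bby\s" + re.escape(by_host) + ")", re.IGNORECASE
    )


rule = received_rule(r"dhcp", "mx1.example.org")

# Invented headers: one added by the chosen host, one by some other relay.
ours = ("from dhcp-45.example.net (dhcp-45.example.net [192.0.2.45]) "
        "by mx1.example.org with ESMTP id ABC123")
theirs = ("from dhcp-45.example.net (dhcp-45.example.net [192.0.2.45]) "
          "by relay.example.net with ESMTP id XYZ789")

print(bool(rule.search(ours)))
print(bool(rule.search(theirs)))
```

The test part only counts when the same header also carries the "by" clause for the chosen host, which is what pins the rule to one specific Received header.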

How robust this is will depend on how complex your mail relay chain
is. If your trusted ISP has lots of exposed mail servers it may be
difficult to do it this way.

'course, now somebody will chime in with a SA facility for doing this
neatly that I'm not aware of, and make me look silly (again)... :)

Like: is there a pseudo-header available to rules that is the
outermost Received header in the trust path? If not, then it might be
a useful addition.
