Posted to commits@spamassassin.apache.org by Apache Wiki <wi...@apache.org> on 2005/06/28 04:19:16 UTC

[Spamassassin Wiki] Update of "TrustedRelays" by JustinMason

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Spamassassin Wiki" for change notification.

The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/TrustedRelays

The comment on the change is:
a comprehensive page on how header trust works (at last)

New page:
= Trusted relays, and how header trust works =

SpamAssassin automatically attempts to work out which Received: headers were inserted by trustworthy mailservers, and which were not.  This allows it to:

 * optimize DNSBL lookups
 * detect when a mail never left a trusted network path
 * know when a Received header can be trusted for whitelisting purposes

This page describes the concept of 'trust' as used internally by SpamAssassin, and how it appears in the output.  See also TrustPath for details on how to influence this by setting the 'trusted_networks' parameter, etc.

== An example ==

Here's an example email, with a set of Received headers for analysis:

{{{
Received: from internal.example.com [127.0.0.1] by localhost
    for someone@example.com; Fri, 07 Dec 2001 11:07:40 +1100 (EST)
Received: from dmz.example.com [150.51.53.1] by internal.example.com
    for someone@example.com; Fri, 07 Dec 2001 11:07:35 +1100 (EST)
Received: from friend.example.com [212.17.35.14] by dmz.example.com
    for someone@example.com; Fri, 07 Dec 2001 11:07:35 +1100 (EST)
Received: from notrust.example.com [193.120.149.226] by friend.example.com
    for someone@example.com; Fri, 07 Dec 2001 11:07:30 +1100 (EST)
Received: from loser.example.org [61.119.13.18] by notrust.example.com
    for someone@example.com; Fri, 07 Dec 2001 11:07:25 +1100 (EST)
Received: from chaos.example.net [210.73.88.134] by loser.example.org
    for someone@example.com; Fri, 07 Dec 2001 11:07:20 +1100 (EST)
Received: from evil.example.net [144.137.3.98] by chaos.example.net
    for someone@example.com; Fri, 07 Dec 2001 11:07:15 +1100 (EST)
From: "DNSBL Testing" <sp...@example.com>
Subject: no subject needed
Date: Fri, 7 Dec 2001 07:01:03
Message-Id: <20...@mail.netnoteinc.com>

hello
}}}

Assume the scanner is running on 'internal.example.com', and that the TrustPath on that machine is set up to trust two hosts: the DMZ machine 'dmz.example.com' at 150.51.53.1 (which is also marked as internal using 'internal_networks'), and a trustworthy external machine 'friend.example.com' at 212.17.35.14.  The message then passed through the following relays, listed from the original source to the final handover:

 * '''Untrusted''' source at evil.example.net [144.137.3.98]
 * '''Untrusted''' relay at chaos.example.net [210.73.88.134]
 * '''Untrusted''' relay at loser.example.org [61.119.13.18]
 * '''Untrusted''' relay at notrust.example.com [193.120.149.226]
 * '''Trusted''' relay at friend.example.com [212.17.35.14]
 * '''Trusted''' and '''internal''' relay at dmz.example.com [150.51.53.1]
 * '''Trusted''' and '''internal''' localhost handover at internal.example.com [127.0.0.1]
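
The classification in the list above can be reproduced with a short sketch.  This is only an illustration of the idea, not SpamAssassin's actual implementation: the names `TRUSTED`, `INTERNAL` and `classify` are hypothetical, the network lists simply mirror the scenario described above, and the sketch assumes that once one relay is found to be untrusted, every older relay is treated as untrusted too.

```python
import ipaddress

# Relays from the example message, most recent Received header first.
RELAYS = [
    ("internal.example.com", "127.0.0.1"),
    ("dmz.example.com", "150.51.53.1"),
    ("friend.example.com", "212.17.35.14"),
    ("notrust.example.com", "193.120.149.226"),
    ("loser.example.org", "61.119.13.18"),
    ("chaos.example.net", "210.73.88.134"),
    ("evil.example.net", "144.137.3.98"),
]

# Hypothetical settings mirroring the scenario in the text.
TRUSTED = [ipaddress.ip_network(n)
           for n in ("127.0.0.0/8", "150.51.53.1/32", "212.17.35.14/32")]
INTERNAL = [ipaddress.ip_network(n)
            for n in ("127.0.0.0/8", "150.51.53.1/32")]


def in_any(ip, networks):
    """Return True if the address falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


def classify(relays, trusted, internal):
    """Walk the relays from most recent to oldest.

    Assumption of this sketch: the chain of trust (and of 'internal'
    status) is broken by the first relay that fails the check, and
    stays broken for all older relays.
    """
    result = []
    chain_trusted = True
    chain_internal = True
    for helo, ip in relays:
        chain_trusted = chain_trusted and in_any(ip, trusted)
        chain_internal = (chain_internal and chain_trusted
                          and in_any(ip, internal))
        result.append({"helo": helo, "ip": ip,
                       "trusted": chain_trusted,
                       "internal": chain_internal})
    return result


results = classify(RELAYS, TRUSTED, INTERNAL)
for r in results:
    print(r["helo"], r["ip"], "trusted" if r["trusted"] else "untrusted",
          "internal" if r["internal"] else "external")
```

With these assumed settings, the first three relays come out as trusted (the first two also internal) and the remaining four as untrusted, which matches the list above.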

Here's what SpamAssassin's trust path algorithm makes of this (pasted from "spamassassin -D -t" output):

{{{
[16429] dbg: received-header: parsed as [ ip=127.0.0.1 rdns= helo=internal.example.com by=localhost ident= envfrom= intl=0 id= auth= ]
[16429] dbg: received-header: relay 127.0.0.1 trusted? yes internal? yes
[16429] dbg: dns: looking up PTR record for '150.51.53.1'
[16429] dbg: dns: PTR for '150.51.53.1': ''
[16429] dbg: received-header: parsed as [ ip=150.51.53.1 rdns= helo=dmz.example.com by=internal.example.com ident= envfrom= intl=0 id= auth= ]
[16429] dbg: received-header: relay 150.51.53.1 trusted? yes internal? yes
[16429] dbg: dns: looking up PTR record for '212.17.35.14'
[16429] dbg: dns: PTR for '212.17.35.14': ''
[16429] dbg: received-header: parsed as [ ip=212.17.35.14 rdns= helo=friend.example.com by=dmz.example.com ident= envfrom= intl=0 id= auth= ]
[16429] dbg: received-header: relay 212.17.35.14 trusted? yes internal? yes
[16429] dbg: dns: looking up PTR record for '193.120.149.226'
[16429] dbg: dns: PTR for '193.120.149.226': ''
[16429] dbg: received-header: parsed as [ ip=193.120.149.226 rdns= helo=notrust.example.com by=friend.example.com ident= envfrom= intl=0 id= auth= ]
[16429] dbg: received-header: relay 193.120.149.226 trusted? no internal? no
[16429] dbg: dns: looking up PTR record for '61.119.13.18'
[16429] dbg: dns: PTR for '61.119.13.18': ''
[16429] dbg: received-header: parsed as [ ip=61.119.13.18 rdns= helo=loser.example.org by=notrust.example.com ident= envfrom= intl=0 id= auth= ]
[16429] dbg: received-header: relay 61.119.13.18 trusted? no internal? no
[16429] dbg: dns: looking up PTR record for '210.73.88.134'
[16429] dbg: dns: PTR for '210.73.88.134': ''
[16429] dbg: received-header: parsed as [ ip=210.73.88.134 rdns= helo=chaos.example.net by=loser.example.org ident= envfrom= intl=0 id= auth= ]
[16429] dbg: received-header: relay 210.73.88.134 trusted? no internal? no
[16429] dbg: dns: looking up PTR record for '144.137.3.98'
[16429] dbg: dns: PTR for '144.137.3.98': ''
[16429] dbg: received-header: parsed as [ ip=144.137.3.98 rdns= helo=evil.example.net by=chaos.example.net ident= envfrom= intl=0 id= auth= ]
[16429] dbg: received-header: relay 144.137.3.98 trusted? no internal? no
}}}

The next two lines of the debug output are the most noteworthy, especially since rules can match against these ''metadata pseudoheaders''.  Here they are:

the 'X-Spam-Relays-Trusted' pseudoheader:
{{{
[ ip=127.0.0.1 rdns= helo=internal.example.com by=localhost ident= envfrom= intl=1 id= auth= ]
[ ip=150.51.53.1 rdns= helo=dmz.example.com by=internal.example.com ident= envfrom= intl=1 id= auth= ]
[ ip=212.17.35.14 rdns= helo=friend.example.com by=dmz.example.com ident= envfrom= intl=0 id= auth= ]
}}}

the 'X-Spam-Relays-Untrusted' pseudoheader:
{{{
[ ip=193.120.149.226 rdns= helo=notrust.example.com by=friend.example.com ident= envfrom= intl=0 id= auth= ]
[ ip=61.119.13.18 rdns= helo=loser.example.org by=notrust.example.com ident= envfrom= intl=0 id= auth= ]
[ ip=210.73.88.134 rdns= helo=chaos.example.net by=loser.example.org ident= envfrom= intl=0 id= auth= ]
[ ip=144.137.3.98 rdns= helo=evil.example.net by=chaos.example.net ident= envfrom= intl=0 id= auth= ]
}}}

As you can see, they list the contents of each Received header in a standardised, machine-readable format.  This insulates rules from the vagaries of the Received header itself, which tends to look radically different from one MTA to the next.
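
To show how regular the format is, here is a minimal sketch that parses a pseudoheader value into one dictionary per relay.  It is not part of SpamAssassin; `parse_relays` is a hypothetical helper, and the sketch assumes that each entry is wrapped in `[ ... ]` and that field values never contain whitespace, as in the examples above.

```python
import re

# One "[ key=value key=value ... ]" entry per relay.
ENTRY_RE = re.compile(r"\[ (.*?) \]")
# key=value pairs; the value may be empty (e.g. "rdns=").
FIELD_RE = re.compile(r"(\w+)=(\S*)")


def parse_relays(pseudoheader):
    """Return a list of dicts, one per relay, in the order they appear."""
    relays = []
    for entry in ENTRY_RE.findall(pseudoheader):
        relays.append(dict(FIELD_RE.findall(entry)))
    return relays


# Two entries copied from the 'X-Spam-Relays-Untrusted' example above.
sample = (
    "[ ip=193.120.149.226 rdns= helo=notrust.example.com "
    "by=friend.example.com ident= envfrom= intl=0 id= auth= ] "
    "[ ip=61.119.13.18 rdns= helo=loser.example.org "
    "by=notrust.example.com ident= envfrom= intl=0 id= auth= ]"
)

relays = parse_relays(sample)
for relay in relays:
    print(relay["ip"], relay["helo"], "recorded by", relay["by"])
```

Empty fields such as `rdns=` come back as empty strings, so callers can test them directly.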

Some sample rules that use this data can be seen in the standard SpamAssassin rules file, '20_fake_helo_tests.cf'.

== DNSBL lookups and 'firsttrusted' ==

DNSBL rules support '-firsttrusted' and '-untrusted' as special-case keywords to control IP address selection.  These keywords do NOT refer to the trust status of the lines themselves!  They refer to the trust status of the data that will be looked up in the DNSBL.

This hinges on a key border case.  The most recent 'untrusted' header line is in an interesting grey area: the '''host''' it describes is an untrusted host, but the '''data''' recorded about that host is, in itself, trustworthy.

Above, for example, 193.120.149.226 is listed as an untrusted host and is therefore listed in the 'X-Spam-Relays-Untrusted' pseudoheader.  However, its IP address was ''recorded'' by a trusted host, so the IP address ''is'' trustworthy.

This is the only 'grey area', however.  All the other hosts listed in the 'X-Spam-Relays-Untrusted' pseudoheader are untrusted themselves, and their details were not recorded by a trusted host either.  Neither the lines themselves nor the IP addresses in them are trustworthy, since they could have been generated by a spamware application creating fake header data.
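
The distinction between an untrusted '''host''' and untrusted '''data''' can be sketched in a few lines.  This only illustrates the split described above; `split_by_data_trust` is a hypothetical helper, not a SpamAssassin function, and it assumes the untrusted relays are given most recent first, as in the pseudoheader.

```python
def split_by_data_trust(untrusted_ips):
    """Split the untrusted relays by whether their recorded IP can be believed.

    untrusted_ips: IPs from 'X-Spam-Relays-Untrusted', most recent first.
    Returns (ip_recorded_by_trusted_host, ips_that_may_be_forged).
    """
    if not untrusted_ips:
        return None, []
    # The first entry was written down by the first trusted relay, so the
    # address is reliable even though the host itself is untrusted.
    # Everything older was recorded by untrusted hosts and could be faked.
    return untrusted_ips[0], untrusted_ips[1:]


# IPs from the 'X-Spam-Relays-Untrusted' example above.
untrusted = ["193.120.149.226", "61.119.13.18",
             "210.73.88.134", "144.137.3.98"]

reliable_ip, possibly_forged = split_by_data_trust(untrusted)
print("recorded by a trusted host:", reliable_ip)
print("possibly forged:", possibly_forged)
```

Here 193.120.149.226 is the single 'grey area' address, and the other three are the ones whose header data may have been forged.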