Posted to users@spamassassin.apache.org by Bret Miller <br...@wcg.org> on 2007/08/31 20:50:15 UTC

Parsing Received Headers

I'm trying to get Received headers to parse correctly, because the ones from
CommuniGate Pro don't always. Since I'm already modifying the headers in my
connector (the MTA can't do RDNS without rejecting based on it), I'm now
aware that certain types of headers don't parse correctly. My current
problem is this one:

Received: from [206.74.184.2] (HELO [206.74.184.2])
	 by mail.wcg.org (CommuniGate Pro SMTP 5.1.11)
	 with ESMTP id 22363646
	 for xxxx@wcg.org; Fri, 31 Aug 2007 10:32:08 -0700

Which is unmodified except for the obscuring of the email address. My RDNS
lookup was modifying the header to read:

Received: from [206.74.184.2] (HELO [206.74.184.2]) (206.74.184.2)
	 by mail.wcg.org (CommuniGate Pro SMTP 5.1.11)
	 with ESMTP id 22363646
	 for xxxx@wcg.org; Fri, 31 Aug 2007 10:32:08 -0700

Meaning that there was no RDNS for 206.74.184.2 and when it said helo, it
said "HELO [206.74.184.2]". However, SA is not parsing it that way. So, can
anyone tell me how to write the received header so SA understands it?

How do I know it's not parsing correctly? Debug log:

[-2240] dbg: received-header: parsed as [ ip=206.74.184.2 rdns=HELO
helo=!206.74.184.2! by=mail.wcg.org ident= envfrom= intl=0 id=22363646 auth=
msa=0 ]
[-2240] dbg: received-header: relay 206.74.184.2 trusted? no internal? no
msa? no
[-2240] dbg: metadata: X-Spam-Relays-Trusted: 
[-2240] dbg: metadata: X-Spam-Relays-Untrusted: [ ip=206.74.184.2 rdns=HELO
helo=!206.74.184.2! by=mail.wcg.org ident= envfrom= intl=0 id=22363646 auth=
msa=0 ]
[-2240] dbg: metadata: X-Spam-Relays-Internal: 
[-2240] dbg: metadata: X-Spam-Relays-External: [ ip=206.74.184.2 rdns=HELO
helo=!206.74.184.2! by=mail.wcg.org ident= envfrom= intl=0 id=22363646 auth=
msa=0 ]
[-2240] dbg: metadata: X-Relay-Countries: US

Obviously, the RDNS wasn't "HELO". 
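
For anyone checking their own logs, here is a minimal sketch (Python, not
part of SpamAssassin) that splits one relay entry from the debug output
above into its fields and flags an rdns value that is really an SMTP
keyword. The names parse_relay and suspicious_rdns are made up for this
illustration.

```python
import re

# key=value pairs; a value runs up to the next whitespace and may be empty
PAIR_RE = re.compile(r"(\w+)=(\S*)")


def parse_relay(entry: str) -> dict:
    """Split one '[ ip=... rdns=... ]' relay entry into a dict of strings."""
    inner = entry.strip().strip("[]")
    return dict(PAIR_RE.findall(inner))


def suspicious_rdns(relay: dict) -> bool:
    """True when the rdns field holds an SMTP keyword instead of a host name."""
    return relay.get("rdns", "").upper() in {"HELO", "EHLO"}


entry = ("[ ip=206.74.184.2 rdns=HELO helo=!206.74.184.2! by=mail.wcg.org "
         "ident= envfrom= intl=0 id=22363646 auth= msa=0 ]")
relay = parse_relay(entry)
print(relay["ip"], relay["rdns"], suspicious_rdns(relay))
```

The entry string is copied from the log above with the line wrapping
removed; empty fields such as ident= come back as empty strings.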

Or perhaps I should just open a bug ticket to fix SA's "not understanding"
problem...

Bret

Re: Parsing Received Headers

Posted by Thomas Kishel <to...@darkhorse.com>.
Bret,


Bret Miller wrote:
> 
> Or perhaps I should just open a bug ticket to fix SA's "not understanding"
> problem...
> 

(Also posted to CGP mailing list) 

If you are receiving false-positives with CGP and the SpamAssassin 3.2.x
RDNS_NONE test ...

If SpamAssassin 3.1.x could not identify RDNS data in a "Received: from"
header (due to formatting or omission), it would perform an RDNS lookup
itself. That functionality was removed in SpamAssassin 3.2.x as per:

    http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5054

The author comments: "we can move that lookup out to the eval test that uses
it, pretty easily", but the RDNS_NONE test in 20_dynrdns.cf (among others)
continues to just parse the X-Spam-Relays-Untrusted header set in
SpamAssassin/Message/Metadata/Received.pm. You can re-enable that feature
using the following patch. Note that the diff runs from the patched file to
the original, so it has to be applied in reverse (patch -R).

80,83d79
<   # TJK Restore SA RDNS Resolution for CGP.
<   $self->{permsgstatus} = $permsgstatus;
<   $self->{is_dns_available} = $self->{permsgstatus}->is_dns_available();
<
1249,1258c1245
<       # TJK Restore SA RDNS Resolution for CGP.
<       if ($self->{is_dns_available}) {
<         $rdns = $self->{permsgstatus}->lookup_ptr($ip);
<         if (! $rdns) {
<           $rdns = '';
<           $relay->{rdns_not_in_headers} = 1
<         }
<       } else {
<         $relay->{rdns_not_in_headers} = 1;
<       }
---
>       $relay->{rdns_not_in_headers} = 1;

Note that the "verified" flag that CGP sets in the "Received: from" header
denotes the status of the HELO command, not the RDNS of the connecting host.
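
The branch structure of the patch above (look up the PTR record when DNS is
available, otherwise flag the relay as having no RDNS in its headers) can be
sketched outside SpamAssassin as follows. This is Python for illustration
only: fill_in_rdns, system_ptr_lookup and the plain-dict relay are
assumptions of the sketch, not SpamAssassin's API, and unlike the patch the
sketch also sets rdns to the empty string when DNS is unavailable.

```python
import socket
from typing import Callable, Optional


def system_ptr_lookup(ip: str) -> Optional[str]:
    """Return the PTR name for ip via the system resolver, or None on failure."""
    try:
        hostname, _aliases, _addrs = socket.gethostbyaddr(ip)
    except OSError:  # socket.herror and socket.gaierror derive from OSError
        return None
    return hostname


def fill_in_rdns(relay: dict, dns_available: bool,
                 lookup: Callable[[str], Optional[str]] = system_ptr_lookup) -> dict:
    """Return a copy of relay with rdns filled in, or flagged as missing."""
    result = dict(relay)
    rdns = lookup(result["ip"]) if dns_available else None
    if rdns:
        result["rdns"] = rdns
    else:
        # No PTR record, or DNS not usable: record that the header had no RDNS.
        result["rdns"] = ""
        result["rdns_not_in_headers"] = 1
    return result
```

Passing the lookup function as a parameter keeps the sketch testable without
network access; a caller can substitute a stub resolver.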

---

Example:

Single sending host with an IP address of 123.456.789.200.

DNS:

name-x.source.com A 123.456.789.100
name-y.source.com A 123.456.789.200
name-z.source.com A 123.456.789.300

Reverse DNS:

123.456.789.100 PTR name-x.source.com
123.456.789.200 PTR name-z.source.com
123.456.789.300 PTR name-z.source.com

telnet cgp.destination.com 25
HELO 123.456.789.100
Received: from [123.456.789.200] (HELO 123.456.789.100) by
cgp.destination.com
# unverified HELO: 123.456.789.100 communicated from 123.456.789.200

telnet cgp.destination.com 25
HELO name-x.source.com
Received: from [123.456.789.200] (HELO name-x.source.com) by
cgp.destination.com
# unverified HELO: name-x.source.com aka 123.456.789.100 communicated from
123.456.789.200

telnet cgp.destination.com 25
HELO name-y.source.com
Received: from name-y.source.com ([123.456.789.200] verified) by
cgp.destination.com
# verified HELO: name-y.source.com aka 123.456.789.200 communicated from
123.456.789.200
# but reverse of 123.456.789.200 is name-z.source.com
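
The two header shapes in the example above can be told apart mechanically.
What follows is a minimal Python sketch, not SpamAssassin code: the regular
expressions and the name classify_cgp are illustrative assumptions, and they
cover only the two forms shown here, not every format CGP may emit.

```python
import re
from typing import Optional

# "from [ip] (HELO helo-string)": HELO string reported, not verified
HELO_FORM = re.compile(
    r"^Received:\s+from\s+\[(?P<ip>[^\]]+)\]\s+\(HELO\s+(?P<helo>[^)]+)\)")
# "from name ([ip] verified)": HELO name resolved to the connecting IP
VERIFIED_FORM = re.compile(
    r"^Received:\s+from\s+(?P<helo>\S+)\s+\(\[(?P<ip>[^\]]+)\]\s+verified\)")


def classify_cgp(header: str) -> Optional[dict]:
    """Return ip, helo and helo_verified for a recognised header, else None."""
    unfolded = " ".join(header.split())  # undo folding across lines
    match = HELO_FORM.match(unfolded)
    if match:
        return {"ip": match["ip"], "helo": match["helo"], "helo_verified": False}
    match = VERIFIED_FORM.match(unfolded)
    if match:
        return {"ip": match["ip"], "helo": match["helo"], "helo_verified": True}
    return None
```

Neither form yields an rdns field, which matches the point made above: the
"verified" flag describes the HELO name, not the reverse DNS of the
connecting host.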

--

Tom Kishel
Dark Horse Comics



Re: Parsing Received Headers

Posted by John Rudd <jr...@ucsc.edu>.
> Bret Miller wrote:
>> Received: from [206.74.184.2] (HELO [206.74.184.2])
>> 	 by mail.wcg.org (CommuniGate Pro SMTP 5.1.11)
>> ...
>> Meaning that there was no RDNS for 206.74.184.2 

Actually, CommuniGate sometimes does that even when RDNS _is_ available.

For example:

Received: from [128.114.125.44] (HELO smtp-prod-mx2.ucsc.edu)
   by copper.ucsc.edu (CommuniGate Pro SMTP 5.1.7)


128.114.125.44 is one of my own machines, and it DEFINITELY has RDNS.

I have yet to figure out what all of the different predictable formats 
are for CommuniGate's Received headers.  It is, unfortunately, something 
that CommuniGate hasn't seen fit to formalize and publish.

RE: Parsing Received Headers

Posted by Bret Miller <br...@wcg.org>.
> > I'm trying to get received headers to parse correctly because the ones from
> > CommuniGate Pro don't always. And, since I'm already modifying the headers
> > in my connector due to the MTA not being able to do RDNS without rejecting
> > based on it, I'm not aware that certain types of headers don't parse
> > correctly. My current problem is this one:
> > ...
> > My RDNS lookup was modifying the header to read:
> 
> Since you are already fixing broken Received header fields,
> I suggest you do it by the book. The syntax is prescribed
> by RFC 2821 (4.4 Trace Information):
> 
> ...
>    This line MUST be structured as follows:
> 
>    -  The FROM field, which MUST be supplied in an SMTP environment,
>       SHOULD contain both (1) the name of the source host as presented
>       in the EHLO command and (2) an address literal containing the IP
>       address of the source, determined from the TCP connection.
> ...
> 
> From-domain = "FROM" FWS Extended-Domain CFWS
> 
> Extended-Domain = Domain /
>            ( Domain FWS "(" TCP-info ")" ) /
>            ( Address-literal FWS "(" TCP-info ")" )
> 
> TCP-info = Address-literal / ( Domain FWS Address-literal )
>       ; Information derived by server from TCP connection
>       ; not client EHLO.
> 
> Domain = (sub-domain 1*("." sub-domain)) / address-literal

As for reporting this to the CommuniGate people, I doubt they have any
interest in fixing it. After all, they still use the domain name instead of
the machine name for their own EHLO/HELO command and provide no way of
overriding it for RFC compliance. We got around it by (against their
recommendation) licensing our copy to the machine instead of the domain.

Anyway, the above doesn't make any more sense to me than reading examples in
the mail I receive. So far, I haven't come up with a format that works for
SA. So, please correct:

HELO bretspc, IP 192.168.1.125, RDNS bretspc.example.com
Received: from bretspc (bretspc.example.com 192.168.1.125)...

HELO [192.168.1.125], IP 192.168.1.125, RDNS none
Received: from [192.168.1.125] (unknown 192.168.1.125)...

HELO 192.168.1.125, IP 192.168.1.125, RDNS 192.168.1.125 (yeah, I've seen
ones like this)
Received: from 192.168.1.125 (192.168.1.125 192.168.1.125)...

And then there's the matter of adding whether the sender was authenticated,
and what was supplied as "mail from". 

Perhaps the better way to do this would be to fix SA to read the CGPro
headers and do its own RDNS lookup if necessary. The problem is that not all
the information is available to SA at that point, so I have to supply some
of it, and I suppose there would be concerns as to whether SA should be
doing the RDNS lookup itself too.

Maybe a plugin? But can a plugin get control early enough to re-write the
received header info so that it's correct for all the other places in SA it
gets used? 

So I guess my choices are these: rewrite the received header to make it
readable, patch SA to read the information correctly (this doesn't solve my
missing RDNS info problem unless I add the lookup to SA too), or add a
plugin if it's possible to do what needs to be done with it.

Honestly, rewriting the header is probably the easiest, which is why I chose
to do that. Now it's just a matter of rewriting it so that SA can actually
read it properly. I guess another problem is that I might have to say I'm
NOT running CommuniGate Pro so that SA doesn't try its custom code on it...

Bret

Re: Parsing Received Headers

Posted by Mark Martinec <Ma...@ijs.si>.
Bret,

> I'm trying to get received headers to parse correctly because the ones from
> CommuniGate Pro don't always. And, since I'm already modifying the headers
> in my connector due to the MTA not being able to do RDNS without rejecting
> based on it, I'm not aware that certain types of headers don't parse
> correctly. My current problem is this one:
> ...
> My RDNS lookup was modifying the header to read:

Since you are already fixing broken Received header fields,
I suggest you do it by the book. The syntax is prescribed
by RFC 2821 (4.4 Trace Information):

...
   This line MUST be structured as follows:

   -  The FROM field, which MUST be supplied in an SMTP environment,
      SHOULD contain both (1) the name of the source host as presented
      in the EHLO command and (2) an address literal containing the IP
      address of the source, determined from the TCP connection.
...

From-domain = "FROM" FWS Extended-Domain CFWS

Extended-Domain = Domain /
           ( Domain FWS "(" TCP-info ")" ) /
           ( Address-literal FWS "(" TCP-info ")" )

TCP-info = Address-literal / ( Domain FWS Address-literal )
      ; Information derived by server from TCP connection
      ; not client EHLO.

Domain = (sub-domain 1*("." sub-domain)) / address-literal
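
Read literally, the grammar above gives a from clause of the form
"from <HELO name> (<RDNS name> [<IP>])", with the RDNS name dropped when
there is none. A minimal Python sketch of that assembly follows. The helper
from_clause is hypothetical, it handles IPv4 address literals only, and
whether SpamAssassin's parser accepts every result has not been verified
here.

```python
def from_clause(helo: str, ip: str, rdns: str = "") -> str:
    """Build 'from <helo> (<TCP-info>)' following the quoted grammar.

    helo is the string the client sent in HELO/EHLO, passed through as-is;
    ip is the address seen on the TCP connection; rdns is the PTR name,
    or an empty string when there is none.
    """
    address_literal = f"[{ip}]"
    # TCP-info = Address-literal / ( Domain FWS Address-literal )
    tcp_info = f"{rdns} {address_literal}" if rdns else address_literal
    return f"from {helo} ({tcp_info})"


print(from_clause("bretspc", "192.168.1.125", "bretspc.example.com"))
print(from_clause("[192.168.1.125]", "192.168.1.125"))
```

The first call produces "from bretspc (bretspc.example.com [192.168.1.125])"
and the second "from [192.168.1.125] ([192.168.1.125])". The HELO string is
not validated, so a bare IP sent as HELO ends up before the parentheses
unchanged.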


John Rudd writes:
> That's correct.  CommuniGate puts the DNS based information before the
> parentheses, and puts non-DNS based information inside the parentheses
> (for authenticated email, it puts the authenticated user account instead
> of the HELO info, for non-authenticated it puts the HELO string, but
> there's also a third case which I'm not recalling at the moment).

Someone with a CommuniGate maintenance contract should open a bug report.
They are implementing an SMTP-based mailer and did not care to read the
basic RFC.

  Mark

Re: Parsing Received Headers

Posted by John Rudd <jr...@ucsc.edu>.
Matus UHLAR - fantomas wrote:
>> Bret Miller wrote:
>>> Received: from [206.74.184.2] (HELO [206.74.184.2])
>>> 	 by mail.wcg.org (CommuniGate Pro SMTP 5.1.11)
>>> ...
>>> Meaning that there was no RDNS for 206.74.184.2 and when it said helo, it
>>> said "HELO [206.74.184.2]". However, SA is not parsing it that way. So, can
>>> anyone tell me how to write the received header so SA understands it?
> 
> On 01.09.07 13:41, Bob Proulx wrote:
>> Hmm...  I think if you replace "HELO" with "unknown" that it might
>> parse.  I did not look at the SA parsing rules but a typical MTA would
>> have placed "unknown" there instead of the HELO. 
> 
> as far as I understand it, the string in parentheses is the helo string and
> that's why is HELO there. Typical MTA places helo string before parentheses
> and RDNS and/or IP between them. Seems this is the opposite case.
> 

That's correct.  CommuniGate puts the DNS based information before the 
parentheses, and puts non-DNS based information inside the parentheses 
(for authenticated email, it puts the authenticated user account instead 
of the HELO info, for non-authenticated it puts the HELO string, but 
there's also a third case which I'm not recalling at the moment).

Re: Parsing Received Headers

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
> Bret Miller wrote:
> > Received: from [206.74.184.2] (HELO [206.74.184.2])
> > 	 by mail.wcg.org (CommuniGate Pro SMTP 5.1.11)
> > ...
> > Meaning that there was no RDNS for 206.74.184.2 and when it said helo, it
> > said "HELO [206.74.184.2]". However, SA is not parsing it that way. So, can
> > anyone tell me how to write the received header so SA understands it?

On 01.09.07 13:41, Bob Proulx wrote:
> Hmm...  I think if you replace "HELO" with "unknown" that it might
> parse.  I did not look at the SA parsing rules but a typical MTA would
> have placed "unknown" there instead of the HELO. 

As far as I understand it, the string in parentheses is the HELO string, and
that's why HELO is there. A typical MTA places the HELO string before the
parentheses and the RDNS and/or IP inside them. This seems to be the
opposite case.

>  I would try that.  A
> typical MTA would place an unresolving line such as this following:
> 
>   Received: from localhost.localdomain (unknown [127.0.0.1])
> 
> The part after the "from" would be the verbatim string from the HELO
> exchange.  It can't really be trusted.  But the RDNS part as you
> observed should not be HELO.
-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
If Barbie is so popular, why do you have to buy her friends? 

Re: Parsing Received Headers

Posted by Bob Proulx <bo...@proulx.com>.
Bret Miller wrote:
> Received: from [206.74.184.2] (HELO [206.74.184.2])
> 	 by mail.wcg.org (CommuniGate Pro SMTP 5.1.11)
> ...
> Meaning that there was no RDNS for 206.74.184.2 and when it said helo, it
> said "HELO [206.74.184.2]". However, SA is not parsing it that way. So, can
> anyone tell me how to write the received header so SA understands it?

Hmm...  I think if you replace "HELO" with "unknown" that it might
parse.  I did not look at the SA parsing rules but a typical MTA would
have placed "unknown" there instead of the HELO.  I would try that.  For a
host that doesn't resolve, a typical MTA would write a line such as the
following:

  Received: from localhost.localdomain (unknown [127.0.0.1])

The part after the "from" would be the verbatim string from the HELO
exchange.  It can't really be trusted.  But the RDNS part as you
observed should not be HELO.

Just guessing here...

Bob