Posted to dev@spamassassin.apache.org by Ben Lentz <bl...@channing-bete.com> on 2006/10/28 14:26:05 UTC

DKIM_VERIFIED Test

Greetings,
I am currently using spamassassin 3.1.7 with SnertSoft's milter-spamc 
1.10.376 and sendmail 8.13.8. I am having problems getting DKIM_VERIFIED 
to work, and after some trial and error, I compared the canonicalized 
files from a signer and verifier system, and found that the data being 
passed to SpamAssassin (and in turn to Plugin/DKIM.pm and Mail::DKIM) 
contained what appeared to be an extra CRLF between the headers and body.

This is almost a "normal" problem with DKIM, as it's sensitive (by 
design) to signs of tampering like this.

Applying a patch against the source for milter-spamc that removes what I 
believe is the line of code injecting this CRLF when the data is 
passed from libmilter to spamassassin:

--- milter-spamc.c.orig 2006-10-17 17:07:09.000000000 -0400
+++ milter-spamc.c 2006-10-17 17:26:22.000000000 -0400
@@ -649,7 +649,7 @@
     if (data->work.skipMessage)
         return SMFIS_CONTINUE;
 
-    (void) BufAddBytes(data->headers, "\r\n", 0, 2);
+    /* (void) BufAddBytes(data->headers, "\r\n", 0, 2); */
 
     /* Insert a simulated Received: header for this server into the
      * header block being sent to spamd. It appears that this header

Results in SpamAssassin doing this:
X-Spam-Report: Content preview: test [...]
____
Content analysis details: (-0.1 points, 4.0 required)
____
 pts rule name              description
---- ---------------------- --------------------------------------------------
...snip...
 2.5 MISSING_HB_SEP         Missing blank line between message header and body
 0.0 DKIM_POLICY_SIGNALL    Domain Keys Identified Mail: policy says domain
                            signs all mails
-0.0 DKIM_VERIFIED          Domain Keys Identified Mail: signature passes
                            verification
 0.0 DKIM_SIGNED            Domain Keys Identified Mail: message has a signature
 0.0 DKIM_POLICY_TESTING    Domain Keys Identified Mail: policy says domain
                            is testing DK

Which is a little unexpected... MISSING_HB_SEP & DKIM_VERIFIED at the 
same time.

However, I've also discovered that if I keep this line, but change the 
CRLF to a LF:

--- com/snert/src/milter-spamc/milter-spamc.c.orig    2006-10-18 15:03:37.000000000 -0400
+++ com/snert/src/milter-spamc/milter-spamc.c    2006-10-18 15:03:49.000000000 -0400
@@ -649,7 +649,7 @@
     if (data->work.skipMessage)
         return SMFIS_CONTINUE;
 
-    (void) BufAddBytes(data->headers, "\r\n", 0, 2);
+    (void) BufAddBytes(data->headers, "\n", 0, 1);
 
     /* Insert a simulated Received: header for this server into the
      * header block being sent to spamd. It appears that this header

Everything appears to work, MISSING_HB_SEP goes away, and DKIM_VERIFIED 
works for signed mail.

However, after I provided all this information to Anthony Howe, developer 
of milter-spamc, he responded with:
> I'm going to reject this patch on the grounds that I claim the DKIM 
> test in SpamAssassin is wrong. RFC 2822 line endings for ALL headers, 
> body lines, and the blank line separating the two are CRLF, not LF. 
> Since the first call to filterBody() when processing a message does 
> NOT contain the CRLF that separates the headers from the body, that 
> CRLF is correctly added to the buffer when filterEndHeaders() is 
> called. It's SpamAssassin that would appear not to be correctly 
> handling message newlines, breaking on LF instead of CRLF.
The part of his theory that doesn't sit right with me is that if I 
re-scan the email that's handed back to sendmail after milter-spamc 
verifies it, DKIM_VERIFIED works fine. It appears that whatever 
milter-spamc is handing to SpamAssassin won't DKIM_VERIFy, but that the 
final output of milter-spamc will DKIM_VERIFy. To me, this is an 
indication of munge when handing off to SpamAssassin, and a removal of 
that munge when handed back to sendmail. I know I've just repeated the 
same point three times, but I'm trying to articulate it correctly.

I'd like to help debug and/or troubleshoot this issue, even though I 
feel my patch to milter-spamc resolves the issue for my systems.

Any thoughts would be greatly appreciated. If you feel it'd be more 
appropriate to post to the users list, I'll do that instead, but I 
thought that this issue might be more dev-y.

Thanks in advance.

Re: DKIM_VERIFIED Test

Posted by Ben Lentz <bl...@channing-bete.com>.

I can't figure out why all my systems, all running sendmail 8.13.8, the 
best MTA in the world (TM), are not transmitting RFC2822 compliant email 
over the wire. *That's* what makes my head spin. If it were MS Exchange, 
I could see it.

Below is Anthony's response.
> Ben Lentz wrote, On 29/10/06 1:26 AM:
>   
>> However, after providing all this information to Anthony Howe, developer
>> of milter-spamc he's responded with:
>>     
>>> I'm going to reject this patch on the grounds that I claim the DKIM
>>> test in SpamAssassin is wrong. RFC 2822 line endings for ALL headers,
>>> body lines, and the blank line separating the two are CRLF, not LF.
>>>       
>
> The problem with this line of reasoning, and I believe the reason why we
> ended up with the practical solution of looking at the line endings of
> the first lines and using what we find for the rest, is that RFC2822
> applies only to the mail as it is sent between computers to an MTA. We
> found that we could not count on the line endings conforming to RFC2822
> at the time that it is sent to SpamAssassin.
>
> To quote from RFC2822:
>
>  This specification is intended as a definition of what message
>  content format is to be passed between systems. Though some message
>  systems locally store messages in this format (which eliminates the
>  need for translation between formats) and others use formats that
>  differ from the one specified in this standard, local storage is
>  outside of the scope of this standard.
>
> We ran into problems when we did anything other than decide on what line
> ending format the message was using and then use that when we add headers.
>
> It seems to me that milter-spamc is making the same mistake that we did,
> which is to assume that it is always ok to add a header in RFC2822
> format. As long as it is not acting as a filter of mail in transit to an
> MTA, then it cannot rely on RFC2822. In practice, we see mail systems
> that internally use RFC2822 format _except_ for using the newline
> convention of the local OS, only taking care of that aspect of RFC2822
> when the mail is sent out to or received from other MTAs. Absent a
> standard, all we can do is figure out which newline is being used and do
> the same with any headers that are added.
>
>  -- sidney
>   
Anthony wrote:

This is a variant of what I mentioned previously about Unix newlines 
found in saved files vs. the newlines used by RFC 2822. This is an 
artifact of how lines are often read in, stripping the CRLF, then 
written to a file, adding back an LF. This is a common mistake by mail 
app implementors who might see it as unimportant (I'm not referring to 
SA dev here, just history). As an aside, Mark Crispin, author of 
UW-IMAP, said this caused so many problems that newer versions of his 
IMAP software now always save the mail to the mailbox folder using CRLF.

The SA spamd protocol document from 2.5 (the original document used when 
milter-spamc was first written) did not specify whether the client 
communications should use CRLF or just LF on the protocol's own headers; 
only the end of header mark. But there is no mention how the mail 
headers and content should be sent to spamd, therefore the assumption 
has always been "as seen off the wire".

Later versions of the spamd protocol document, from 3.0 and 3.1, are a 
little clearer concerning the newlines used for spamc client headers, 
but get wishy-washy about the spamd response headers, varying between 
CRLF and LF. And neither document states what form the mail content 
passed should take, i.e. "as seen off the wire" or normalised to 
using CRLF or LF or, hell, why not Bernstein strings (and avoid CRLF vs. 
LF issues).

Again, I stuck with my original choice of maintaining RFC 2822 newlines, 
since this avoids unnecessary translation, is consistent with mail 
protocol standards, allows for saving the message in a form that could 
later be reintroduced into the mail system, and has worked with SA up 
until DKIM.

I would suggest that SA update the spamd protocol document to be more 
precise as to what it wants to see at every stage of the protocol right 
down to newline format as this would aid implementors.

It's not a mistake to assume RFC 2822 line endings; it's the standard. 
It is other MUA/MTA developers who have chosen to be careless with it, 
such that we have to dumb down our products for the mistakes of others.

I've considered doing as SA suggests, using some limited look-ahead in 
the first body chunk to determine newline type, but the milter API is 
linear, such that this information comes after the headers have been 
given to the milter, sans CRLF or the exact white space between the 
header colon and the value, and have already been placed in a buffer 
using CRLF. It gets messy having to hold that information until the body 
chunks arrive. It feels inherently wrong.

I would like to know why the CRLF header separator is treated as part of 
the message body by SA and not the header section? I send all the 
message headers using CRLF and the separator as CRLF, then I send the 
message body chunks exactly as sendmail provided them to the milter, see 
milter API doc for xxfi_body hook:

http://www.milter.org/milter_api/xxfi_body.html

It states the body chunks _should_ have RFC 2822 CRLF newlines, though 
it may have arrived as LF (grr).

Doing as SA suggests, using the newlines as found in the message body, 
will break one day when some poorly written mail app sends headers & 
separator with CRLF and a message body using LF or, worse, vice versa: 
headers with LF and body with CRLF.

Essentially, to avoid the newline issue, the DKIM spec and its 
implementations should not be signing newlines.

---

I would argue that SpamAssassin should correct their implementation to 
use two different newlines types, those of the headers and separator, 
followed by those for the body after the header section and CRLF.
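
To make the "two different newline types" idea concrete, here is a minimal 
sketch in C of how the header section and the body could be classified 
separately. The names nl_style, classify_newlines and body_offset are 
hypothetical and are not taken from SpamAssassin or milter-spamc; the sketch 
only assumes the whole message is available in one buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the line endings seen in a buffer. */
enum nl_style { NL_NONE, NL_LF, NL_CRLF, NL_MIXED };

/* Classify the line endings found in buf[0..len). */
enum nl_style classify_newlines(const char *buf, size_t len)
{
    enum nl_style seen = NL_NONE;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        enum nl_style cur;

        if (buf[i] != '\n')
            continue;
        cur = (i > 0 && buf[i - 1] == '\r') ? NL_CRLF : NL_LF;
        if (seen == NL_NONE)
            seen = cur;
        else if (seen != cur)
            return NL_MIXED;
    }
    return seen;
}

/* Return the offset of the first body byte, i.e. the byte just past the
 * first empty line ("\n" or "\r\n" on its own). Returns len when the
 * message has no header/body separator. */
size_t body_offset(const char *msg, size_t len)
{
    size_t i, line_start = 0;

    assert(msg != NULL || len == 0);
    for (i = 0; i < len; i++) {
        size_t n;

        if (msg[i] != '\n')
            continue;
        n = i - line_start;            /* line length without the LF */
        if (n == 0 || (n == 1 && msg[line_start] == '\r'))
            return i + 1;
        line_start = i + 1;
    }
    return len;
}
```

Calling classify_newlines() once on the bytes before body_offset() and once 
on the bytes after it gives one answer for headers plus separator and 
another for the body, which is the distinction being argued for here.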

---

I'd also be wondering how SpamAssassin CLI handles DKIM on Windows where 
their newlines are CRLF.

So many issues make my head spin.

Re: DKIM_VERIFIED Test

Posted by Sidney Markowitz <si...@sidney.com>.

Ben Lentz wrote, On 29/10/06 3:49 PM:
> I guess the real way to fix this is to try and detect the header
> delimiter from the first few lines in the message and apply that same
> delimiter later on?
> 
> Does anyone have any tips on helping me convince Anthony that this
> problem exists?

I think that both problems - how to fix it and how to convince Anthony -
are simplified by one big difference between SpamAssassin and
milter-spamc. We have to handle mail that may be extracted at any stage
of going from one MUA to an MTA to another MTA to another MUA, on any
platform. We make some assumptions about compliance to RFC2822, but that
only works to the degree that various MUAs and MTAs happen to use
something close to RFC2822 for their local format. If that means that we
have uncertainty about the newline being used in any given installation,
we have to deal with that when adding headers. That's why we can't tell
what newline to use without looking at some of the newlines that are
already there.

On the other hand, milter-spamc is a sendmail milter, which means it
only has to deal with the realities of newlines in the headers that are
given to milters by sendmail, and can assume a unix-like platform.

I notice that documentation for the sendmail milter API says for the
function smfi_addheader in
http://www.sendmail.org/doc/sendmail-current/libmilter/docs/smfi_addheader.html

 To make a multi-line header, insert a line feed (ASCII 0x0a,
 or \n in C) followed by at least one whitespace character such
 as a space (ASCII 0x20) or tab (ASCII 0x09, or \t in C). The
 line feed should NOT be preceded by a carriage return (ASCII 0x0d);
 the MTA will add this automatically.
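
As a rough illustration of that rule, the following C sketch builds a 
two-line header value joined by a bare line feed and a tab, with no carriage 
return. The helper name fold_header_value is hypothetical; the resulting 
string is the kind of value one would then hand to smfi_addheader, which is 
not called here so that the sketch stays self-contained.

```c
#include <assert.h>
#include <stdio.h>

/* Write "first\n\tsecond" into out: a folded header value that uses a
 * bare LF followed by a tab, never CRLF. Returns the length written
 * (excluding the terminating NUL), or -1 if out is too small. */
int fold_header_value(char *out, size_t cap,
                      const char *first, const char *second)
{
    int n;

    assert(out != NULL && first != NULL && second != NULL);
    n = snprintf(out, cap, "%s\n\t%s", first, second);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```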

That's just an indication, but here is what I think the strongest
argument for Anthony is: Take a look at the newlines in the headers that
milter-spamc is receiving from sendmail. If they are unix-style
newlines, then that is proof that sendmail is using unix newlines
instead of RFC2822 newlines at that stage of the processing, and you
have to do the same to keep the results consistent.

 -- sidney

Re: DKIM_VERIFIED Test

Posted by Ben Lentz <bl...@channing-bete.com>.

> Ben Lentz wrote, On 29/10/06 1:26 AM:
>   
>> However, after providing all this information to Anthony Howe, developer
>> of milter-spamc he's responded with:
>>     
>>> I'm going to reject this patch on the grounds that I claim the DKIM
>>> test in SpamAssassin is wrong. RFC 2822 line endings for ALL headers,
>>> body lines, and the blank line separating the two are CRLF, not LF.
>>>       
>
> The problem with this line of reasoning, and I believe the reason why we
> ended up with the practical solution of looking at the line endings of
> the first lines and using what we find for the rest, is that RFC2822
> applies only to the mail as it is sent between computers to an MTA. We
> found that we could not count on the line endings conforming to RFC2822
> at the time that it is sent to SpamAssassin.
>
> To quote from RFC2822:
>
>  This specification is intended as a definition of what message
>  content format is to be passed between systems. Though some message
>  systems locally store messages in this format (which eliminates the
>  need for translation between formats) and others use formats that
>  differ from the one specified in this standard, local storage is
>  outside of the scope of this standard.
>
> We ran into problems when we did anything other than decide on what line
> ending format the message was using and then use that when we add headers.
>
> It seems to me that milter-spamc is making the same mistake that we did,
> which is to assume that it is always ok to add a header in RFC2822
> format. As long as it is not acting as a filter of mail in transit to an
> MTA, then it cannot rely on RFC2822. In practice, we see mail systems
> that internally use RFC2822 format _except_ for using the newline
> convention of the local OS, only taking care of that aspect of RFC2822
> when the mail is sent out to or received from other MTAs. Absent a
> standard, all we can do is figure out which newline is being used and do
> the same with any headers that are added.
>
>  -- sidney
>   
After examining a tcpdump of the SMTP transaction between my systems 
(sendmail 8.13.8 - 8.13.8), I can confirm that the header delimiter is, 
in fact, only 0x0a. I was being thrown off by looking at the delivered 
message in an MUA, which, when saved, is in 0x0d 0x0a. So, you're 
totally right.

When milter-spamc is running with this code (broken):
(void) BufAddBytes(data->headers, "\r\n", 0, 2);

The header-body canonicalization data from Mail::DKIM is:
00000400  74 79 3a 58 2d 4d 61 69  6c 65 72 3a 20 58 2d 4d  |ty:X-Mailer: X-M|
00000410  69 6d 65 4f 4c 45 3b 20  62 3d 0d 0a 74 65 73 74  |imeOLE; b=..test|
00000420  0d 0a                                             |..|

and I don't get DKIM_VERIFIED.

When milter-spamc is running with this code ("fixed"):
(void) BufAddBytes(data->headers, "\n", 0, 1);

The header-body canonicalization data from Mail::DKIM is:
00000400  74 79 3a 58 2d 4d 61 69  6c 65 72 3a 20 58 2d 4d  |ty:X-Mailer: X-M|
00000410  69 6d 65 4f 4c 45 3b 20  62 3d 74 65 73 74 0d 0a  |imeOLE; b=test..|
00000420

and I do get DKIM_VERIFIED. It's a little disconcerting that the 
header-body separator is completely missing, but it's really not 
missing; I can tell because I don't get MISSING_HB_SEP.

I guess the real way to fix this is to try and detect the header 
delimiter from the first few lines in the message and apply that same 
delimiter later on?
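
A minimal sketch of that idea in C, assuming the header block is already 
sitting in a flat buffer with its original line endings (which, as Anthony 
points out, is not how the milter API actually delivers headers). The names 
detect_newline and append_separator are hypothetical, and the CRLF fallback 
is only an assumption for the case where no line ending is found.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return "\r\n" if the first line of buf ends in CRLF, "\n" if it ends
 * in a bare LF, and fallback if buf contains no line ending at all. */
const char *detect_newline(const char *buf, size_t len, const char *fallback)
{
    const char *lf;

    assert(buf != NULL && fallback != NULL);
    lf = memchr(buf, '\n', len);
    if (lf == NULL)
        return fallback;
    return (lf > buf && lf[-1] == '\r') ? "\r\n" : "\n";
}

/* Append the header/body separator to dst (capacity cap, current length
 * *len), using the newline style detected from what dst already holds.
 * Returns 0 on success, -1 if the separator does not fit. */
int append_separator(char *dst, size_t cap, size_t *len)
{
    const char *nl;
    size_t n;

    assert(dst != NULL && len != NULL);
    nl = detect_newline(dst, *len, "\r\n");
    n = strlen(nl);
    if (*len + n > cap)
        return -1;
    memcpy(dst + *len, nl, n);
    *len += n;
    return 0;
}
```

With headers ending in a bare LF this appends a single LF, and with CRLF 
headers it appends CRLF, so the separator always matches what came before it.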

Does anyone have any tips on helping me convince Anthony that this 
problem exists?

Re: DKIM_VERIFIED Test

Posted by Sidney Markowitz <si...@sidney.com>.

Ben Lentz wrote, On 29/10/06 1:26 AM:
> However, after providing all this information to Anthony Howe, developer
> of milter-spamc he's responded with:
>> I'm going to reject this patch on the grounds that I claim the DKIM
>> test in SpamAssassin is wrong. RFC 2822 line endings for ALL headers,
>> body lines, and the blank line separating the two are CRLF, not LF.

The problem with this line of reasoning, and I believe the reason why we
ended up with the practical solution of looking at the line endings of
the first lines and using what we find for the rest, is that RFC2822
applies only to the mail as it is sent between computers to an MTA. We
found that we could not count on the line endings conforming to RFC2822
at the time that it is sent to SpamAssassin.

To quote from RFC2822:

 This specification is intended as a definition of what message
 content format is to be passed between systems. Though some message
 systems locally store messages in this format (which eliminates the
 need for translation between formats) and others use formats that
 differ from the one specified in this standard, local storage is
 outside of the scope of this standard.

We ran into problems when we did anything other than decide on what line
ending format the message was using and then use that when we add headers.

It seems to me that milter-spamc is making the same mistake that we did,
which is to assume that it is always ok to add a header in RFC2822
format. As long as it is not acting as a filter of mail in transit to an
MTA, then it cannot rely on RFC2822. In practice, we see mail systems
that internally use RFC2822 format _except_ for using the newline
convention of the local OS, only taking care of that aspect of RFC2822
when the mail is sent out to or received from other MTAs. Absent a
standard, all we can do is figure out which newline is being used and do
the same with any headers that are added.
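
For the "only taking care of that aspect when the mail is sent out" case, 
here is a small C sketch of converting a locally stored message to CRLF on 
the way out without doubling line endings that are already CRLF. The 
function name to_crlf is hypothetical and is not part of SpamAssassin or any 
MTA; it is only meant to show the kind of translation being described.

```c
#include <assert.h>
#include <stddef.h>

/* Copy src[0..len) to dst, turning every bare LF into CRLF and leaving
 * existing CRLF pairs untouched. Returns the number of bytes the full
 * output needs; the copy is truncated if that exceeds cap. */
size_t to_crlf(const char *src, size_t len, char *dst, size_t cap)
{
    size_t i, out = 0;

    assert(src != NULL || len == 0);
    assert(dst != NULL || cap == 0);
    for (i = 0; i < len; i++) {
        if (src[i] == '\n' && (i == 0 || src[i - 1] != '\r')) {
            if (out < cap)
                dst[out] = '\r';    /* supply the missing CR */
            out++;
        }
        if (out < cap)
            dst[out] = src[i];
        out++;
    }
    return out;
}
```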

 -- sidney