You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2006/11/12 12:33:09 UTC

[Bug 5179] New: Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179

           Summary: Ambiguity in spamd protocol for line endings can break
                    DKIM_VERIFIED test
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P5
         Component: spamc/spamd
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: sidney@sidney.com


I don't know if this is a SpamAssassin bug, so I'm opening this ticket to
provide a base for discussion to find out if it is. The discussion started on
the dev mailing list. I'm going to paste excerpts from that thread in the next
few comments and set the cc list of this bug report to some people who
participated in it.

This may end up being a bug in an outside program called milter-spamc, or it may
be that there is an ambiguity in the spamd protocol or that there is an edge
case that spamd should handle. I'm moving the discussion from the dev mailing
list to here to make sure that the question doesn't get dropped before it is
settled one way or the other.

Details to follow in comments.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |
            Summary|[review] Ambiguity in spamd |Ambiguity in spamd protocol
                   |protocol for line endings   |for line endings can break
                   |can break DKIM_VERIFIED test|DKIM_VERIFIED test
   Target Milestone|3.1.8                       |3.2.0




------- Additional Comments From jm@jmason.org  2007-02-15 03:55 -------
ick.  reopening, thanks Mark



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From achowe@snert.com  2006-11-12 12:23 -------
As mentioned in previous email by myself, the CRLF header/body separator is part
of the headers section and a protocol element, not part of the body. If the
separator were part of the body, you would see an extra blank line displayed by
your MUA. The sendmail milter API fires the xxfi_eoh handler on the CRLF
header/body separator, but does _not_ pass it to the milter. When sendmail
passes the first body chunk to the milter, its _without_ the CRLF used to denote
the end of headers.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From sidney@sidney.com  2006-11-12 21:14 -------
I tried the really simple test case of the mail being in all DOS format
newlines. DKIM_VERIFIER fails there too. So there is nothing about the header
and body newlines being different involved here.

I see the bug. In DKIM.pm, the return from get_pristine_header is split up into
an array of lines using split(/\n/s, $header)

When the line endings are \r\n that produces an extra array element at the end
consisting of the string '\r'. This will fail whenever the line separator in the
header is '\r\n', even when the entire message uses RFC2822 newlines.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] [review] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179


spamassassin@dostech.ca changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|needs 2 votes for 3.1 branch|needs 1 vote




------- Additional Comments From spamassassin@dostech.ca  2006-12-08 11:53 -------
+1, works for me



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From achowe@snert.com  2006-11-12 23:18 -------
Very good news Sidney.

To address Ben's last comment: I didn't think you could have a split signing
mode in DKIM, it was simple or relaxed for the whole message. I'm almost certain
simple mode would fail in some cases with almost any milter, because sendmail
milter API passes the header name and values separately and line unfolded. I'm
pretty sure that simple mode applied to headers reconstituted by any milter
would fail for any header that spanned more than one line.

This supplemental Internet Draft on DKIM concerning Sender Signing Practices 
http://www.ietf.org/internet-drafts/draft-allman-dkim-ssp-02.txt however fails
to recommend a white space algorithm to use. I would hazard a guess, given this
trouble ticket as an indicator, that signing with anything but relaxed
throughout a message may pose trouble at the receiving end.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED




------- Additional Comments From jm@jmason.org  2007-02-15 10:59 -------
ok -- 

: jm 256...; svn diff lib/Mail/SpamAssassin/Plugin/DKIM.pm
Index: lib/Mail/SpamAssassin/Plugin/DKIM.pm
===================================================================
--- lib/Mail/SpamAssassin/Plugin/DKIM.pm        (revision 507614)
+++ lib/Mail/SpamAssassin/Plugin/DKIM.pm        (working copy)
@@ -252,9 +252,11 @@

   # feed content of message into verifier, using \r\n endings,
   # required by Mail::DKIM API (see bug 5300)
+  # note: bug 5179 comment 28: perl does silly things unless we use \015\012
+  # instead of \r\n
   eval {
     foreach my $line (split(/\n/s, $scan->{msg}->get_pristine)) {
-      $line =~ s/\r?$/\r\n/s;       # ensure \r\n ending
+      $line =~ s/\r?$/\015\012/s;       # ensure \015\012 ending
       $message->PRINT($line);
     }
   };
: exit=0 Thu Feb 15 18:58:16 GMT 2007; cd /home/jm/ftp/spamassassin
: jm 257...; svn commit -m "bug 5179: perl does silly things with \\r\\n line
endings on non-UNIX platforms.  use \\015\\012 instead"
lib/Mail/SpamAssassin/Plugin/DKIM.pm
Sending        lib/Mail/SpamAssassin/Plugin/DKIM.pm
Transmitting file data .
Committed revision 508076.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] [review] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From blentz@channing-bete.com  2006-12-08 12:11 -------
I voted on 11/13, so that's two. Does it need two or three?



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] [review] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179


sidney@sidney.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
  Status Whiteboard|go ahead                    |




------- Additional Comments From sidney@sidney.com  2006-12-08 13:31 -------
Committed to 3.1 branch as revision 484786




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From sidney@sidney.com  2006-11-12 03:50 -------
Another email from Ben Lentz that quotes the intervening messages which I'm
therefore not pasting in separate comments here:


> Ben Lentz wrote, On 29/10/06 1:26 AM:
>  
>> However, after providing all this information to Anthony Howe, developer
>> of milter-spamc he's responded with:
>>    
>>> I'm going to reject this patch on the grounds that I claim the DKIM
>>> test in SpamAssassin is wrong. RFC 2822 line endings for ALL headers,
>>> body lines, and the blank line separating the two are CRLF, not LF.
>>>       
>
> The problem with this line of reasoning, and I believe the reason why we
> ended up with the practical solution of looking at the line endings of
> the first lines and using what we find for the rest, is that RFC2822
> applies only to the mail as it is sent between computers to an MTA. We
> found that we could not count on the line endings conforming to RFC2822
> at the time that it is sent to SpamAssassin.
>
> To quote from RFC2822:
>
>  This specification is intended as a definition of what message
>  content format is to be passed between systems. Though some message
>  systems locally store messages in this format (which eliminates the
>  need for translation between formats) and others use formats that
>  differ from the one specified in this standard, local storage is
>  outside of the scope of this standard.
>
> We ran into problems when we did anything other than decide on what line
> ending format the message was using and then use that when we add headers.
>
> It seems to me that milter-spamc is making the same mistake that we did,
> which is to assume that it is always ok to add a header in RFC2822
> format. As long as it is not acting as a filter of mail in transit to an
> MTA, then it cannot rely on RFC2822. In practice, we see mail systems
> that internally use RFC2822 format _except_ for using the newline
> convention of the local OS, only taking care of that aspect of RFC2822
> when the mail is sent out to or received from other MTAs. Absent a
> standard, all we can do is figure out which newline is being used and do
> the same with any headers that are added.
>
>  -- sidney
>   
After examining a tcpdump of the SMTP transaction between my systems (sendmail
8.13.8 - 8.13.8), I can confirm that the header delimiter is, in fact, only
0x0a. I was being thrown off by looking at the delivered message in an MUA,
which, when saved, is in 0x0d 0x0a. So, you're totally right.

When milter-spamc is running with this code (broken):
(void) BufAddBytes(data->headers, "\r\n", 0, 2);

The header-body canonicalization data from Mail::DKIM is:
00000400  74 79 3a 58 2d 4d 61 69  6c 65 72 3a 20 58 2d 4d  |ty:X-Mailer: X-M|
00000410  69 6d 65 4f 4c 45 3b 20  62 3d 0d 0a 74 65 73 74  |imeOLE; b=..test|
00000420  0d 0a                                             |..|

and I don't get DKIM_VERIFIED.

When milter-spamc is running with this code ("fixed"):
(void) BufAddBytes(data->headers, "\n", 0, 1);

The header-body canonicalization data from Mail::DKIM is:
00000400  74 79 3a 58 2d 4d 61 69  6c 65 72 3a 20 58 2d 4d  |ty:X-Mailer: X-M|
00000410  69 6d 65 4f 4c 45 3b 20  62 3d 74 65 73 74 0d 0a  |imeOLE; b=test..|
00000420

and I do get DKIM_VERIFIED. It's a little disconcerting that the header-body
separator is completely missing, but it's really not missing; I can tell because
I don't get MISSING_HB_SEP.

I guess the real way to fix this is to try and detect the header delimiter from
the first few lines in the message and apply that same delimiter later on?

Does anyone have any tips on helping me convince Anthony that this problem exists?






------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179


achowe@snert.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |achowe@snert.com






------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From Mark.Martinec@ijs.si  2007-02-15 11:28 -------
> -      $line =~ s/\r?$/\r\n/s;       # ensure \r\n ending
> +      $line =~ s/\r?$/\015\012/s;       # ensure \015\012 ending

Not to forget the two other cases of \r\n
few lines further down. Perhaps something like:

  # headers, line-by-line with \r\n endings, as per Mail::DKIM API
  # split lines, deleting endings and final empty line
  foreach my $line (split(/\r?\n/s, $header)) {
    $message->PRINT($line."\015\012");  # ensure true CR LF endings
  }
  $message->PRINT("\015\012");

  # body, line-by-line with CR LF endings
  eval {
    foreach my $line (@{$body}) {
      $line =~ s/\r?\n$/\015\012/s;   # ensure true CR LF endings
      $message->PRINT($line);
    }
  };




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] [review] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179


felicity@apache.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|needs 1 vote                |go ahead




------- Additional Comments From felicity@apache.org  2006-12-08 12:18 -------
(In reply to comment #25)
> I voted on 11/13, so that's two. Does it need two or three?

Thanks for voting, however only votes cast by committers counts towards the
necessary 3 to apply the patch.  It's good to find out from others that the
patch fixes the problem, etc, though. :)

I think this patch is fine though, so +1. :)



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From achowe@snert.com  2006-11-12 12:37 -------
Consider also MIME encoding and DKIM. Each MIME part uses a CRLF to separate the
MIME headers from the MIME part. If the CRLF were part of the MIME part then
MIME processing of pure unencoded binary parts would be in error because of the
prefixed CRLF. How does DKIM handles such a message?



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] [review] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179


Mark.Martinec@ijs.si changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Mark.Martinec@ijs.si




------- Additional Comments From Mark.Martinec@ijs.si  2007-02-14 16:48 -------
Sorry for coming late for lunch again.

> $line =~ s/$/\r\n/s;

Well, the Mail::DKIM documentation clearly states that
line endings must be \015 \012, and reinforces it by an example:

  # use SMTP line terminators
  $dkim->PRINT("$_\015\012");

The exact character codes are important because they enter
as data into crypto hashing.

People used to Perl on Unix platforms (including myself)
often forget the semantics that Perl gives to \r and \n.
It is explained in the perlport man page. Here is an
important excerpt:


  A common misconception ... is that \n eq \012 everywhere.
  When using protocols such as common Internet protocols,
  \012 and \015 are called for specifically, and the values
  of the logical \n and \r (carriage return) are not reliable.
  [...]
             | Unix | DOS  | Mac  |
        ---------------------------
        \n   |  LF  |  LF  |  CR  |
        \r   |  CR  |  CR  |  LF  |
        \n * |  LF  | CRLF |  CR  |
        \r * |  CR  |  CR  |  LF  |
        ---------------------------
        * text-mode STDIO

That conversion section calling $dkim->PRINT should be modified
to always append \015\012 to lines (instead of \r\n), to insure
the code will also work on DOS and Mac.

Could the ticket be reopened, or should I enter the above as
a new entry?

  Mark



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From sidney@sidney.com  2006-11-12 20:50 -------
I looked at the return from get_pristine_header() under three circumstances,

1) The header and body line endings are all \n

2) The header line endings are \r\n, the body are \n, and the separator is \n

3) The header line endings are \r\n, the body are \n, and the separator is \r\n

In case 3 only, get_pristine_header() returns an extra element at the end
consisting of the string '\r'.

I think that has to be considered a bug.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From jm@jmason.org  2006-11-13 01:44 -------
whatever we do here, we should add a test case to the Spamassassin test suite
for it.  that will allow us to catch regressions in the DKIM modules as well as
in SA itself in future, and provide working test data in case we need to
demonstrate that some *other* DKIM implementation is doing it wrong ;)



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From sidney@sidney.com  2006-11-12 03:53 -------
achowe's response:

This is a variant of what I mentioned previously about Unix newlines found in
saved files vs. the newlines used by RFC 2822. This is an artifact of how lines
are often read in stripping CRLF then writen to a file adding back a LF. This is
a common mistake by mail app. implementors who might see it as unimportant (I'm
not referring to SA dev here, just history). As an aside Mark Crispin, author of
UW-IMAP, said this caused so much problems that newer versions of his IMAP
software now always save the mail to the mailbox folder using CRLF.

The SA spamd protocol document from 2.5 (the original document used when
milter-spamc was first written) did not specify whether the client
communications should use CRLF or just LF on the protocol's own headers; only
the end of header mark. But there is no mention how the mail headers and content
should be sent to spamd, therefore the assumption has always been "as seen off
the wire".

Later versions of the spamd protocol document from 3.0, and 3.1 are a little
more clear concerning the newlines used for spamc client headers, but get
wishy-washy about the spamd response headers varying between CRLF and LF. And in
neither document do they state what form the mail content passed should take,
ie. "as seen off the wire" or normalised to using CRLF or LF or hell why not
Berstien Strings (and avoid CRLF v LF issues).

Again I stuck with my original choice of maintaining RFC 2822 newlines, since
this avoids unnecessary translation, is consist with mail protocol standards,
allows for saving the message in form that could later be reintroduced into the
mail system, and has worked with SA up until DKIM.

I would suggest that SA update the spamd protocol document to be more precise as
to what it wants to see at every stage of the protocol right down to newline
format as this would aid implementors.

Its not a mistake to assume RFC 2822 line endings, its the standard. That other
mail MUA/MTA developers have choosen to be careless with it such that we have to
dumb down our products for the mistakes of others.

I've considered doing as SA suggests, using some limited look ahead in the first
body chunk to determine newline type, but the milter API is linear such that
this information comes after the headers have been given to the milter, sans
CRLF or exact white space between heder-colon and the value, already be placed
in a buffer using CRLF. It gets messy having to hold that information until the
body chunks arrive.
It feels inherently wrong.

I would like to know why the CRLF header separator is treated as part of the
message body by SA and not the header section? I send all the message headers
using CRLF and the separator as CRLF, then I send the message body chunks
exactly as sendmail provided them to the milter, see milter API doc for
xxfi_body hook:

http://www.milter.org/milter_api/xxfi_body.html

It states the body chunks _should_ have RFC 2822 CRLF newlines, though it may
have arrived as LF (grr).

Doing as SA suggests, using the newlines as found in the message body, will
break one day when some poorly written mail app. send headers & separator with
CRLF and a message body using LF or worse visa versa headers with LF and body
with CRLF.

Essentially to avoid the newline issue, the DKIM spec and their implementations
should
not be signing newlines.

---

I would argue that SpamAssassin should correct their implementation to use two
different newlines types, those of the headers and separator, followed by those
for the body after the header section and CRLF.

---

I'd also be wondering how SpamAssassin CLI handles DKIM on Windows where their
newlines are CRLF.

So many issues make my head spin.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From sidney@sidney.com  2006-11-12 04:54 -------
Created an attachment (id=3745)
 --> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3745&action=view)
Packet captures before and after Ben Lentz's patch

I missed attaching the packet capture that Ban Lentz took. I'm not including
other  information he sent to the dev list as I don't think it is necessary.
They are in the dev mailing list archives.

This file is described by Ben as follows:

"packet-captures.tar.gz:
two traces, one with the original milter-spamc 1.10.376 release, the other with
milter-spamc 1.10.376 with my patch. spamd on TCP783, milter-spamc on TCP3336.
These should be independent captures of the sendmail to milter-spamc
communication, the mitler-spamc to spamd communication, the spamd to
milter-spamc response, and the mitler-spamc to sendmail response."

What I noticed is that the unpatched version has /r/n newlines in the headers,
followed by /r/n for the blank line separating header and body, followed by a
body that has /n newlines.

The Ben's patched version, which spamd does handle, is the same except that the
blank line separating the header and the body has a /n newline.

However this is happening, it seems to me that if we try to handle headers with
/r/n newlines and bodies with /n newlines, then we should handle the separator
line being either like the header or like the body.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179

------- Additional Comments From blentz@channing-bete.com 2006-11-12 15:36 -------
FWIW:

This may or may not pertain, as the subject of "horizontal" whitespace versus
"vertical" whitespace has been mentioned... I don't want to confuse the issue
regarding the specific header/body separator issue presented here, but the
DomainKey and DKIM standards have had issues in the past with MTA munge
(sendmail), where trailing whitespace in the header portion of an email is
modified, causing the signatures to fail.

http://sourceforge.net/tracker/index.php?func=detail&aid=1485150&group_id=110311&atid=656974

It may be noteworthy that my testing has been specific to DKIM running in
relaxed/simple mode (relaxed for header canonicalization due to sendmail munge,
and simple for body canonicalization). I've chosen this method (relaxed/simple
versus simple/simple) over using _FFR_ANTICIPATE_SENDMAIL_MUNGE since it's my
understanding that relaxed canonicalization is "production-ready" code, instead
of being for future release.

If anyone feels it'd be valuable for me to put my signing systems in
simple/simple with _FFR_ANTICIPATE_SENDMAIL_MUNGE to see what effect it has on
the header/body separator issue illustrated here, I'd be happy to do so. It may
be of limited value, though, as the reflector hosted at
autorespond+dkim@dk.elandsys.com does DKIM in simple/simple mode and it verifies
fine with my hack to milter-spamc (and doesn't without it). The DomainKey from
the same reflector doesn't verify for me, but only because it wasn't signed with
h= style DomainKey headers and because I add a header to the email before it
hits SpamAssassin - this is my fault. The DKIM standard uses h= style signatures
by default, so header additions in the chain don't break the signatures.

Also noteworthy is that DK_VERIFIED otherwise works great... I am using nofws
mode to sign my messages with DomainKey and sendmail + milter-spamc + SA +
Mail::DomainKeys has never had any trouble on the verification end. This could
be due, however, to the idea that the header/body separator is not part of the
DomainKey canonicalization.

As far as MIME goes, it's never been an issue. I purposely test signatures with
HTML emails to "stress-test" the system into parsing all that extra body data...
and it's never been a problem. I'm guessing because the MIME data goes beyond
the header/body separator it's a non-issue (at least for my purposes).

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From jm@jmason.org  2007-02-16 04:11 -------
those are already gone in 3.2.0 -- I think you sent us the patch ;)
true that it needs a different fix in 3.1.x though.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] [review] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From blentz@channing-bete.com  2006-11-13 04:37 -------
I've confirmed that Sidney's latest patch corrects this issue.

I've reverted back to the original milter-spamc-1.10.376 release, with *no*
modifications, applied Sidney's patch to spamassassin and have run through
several test messages both from systems I host and external systems, and in each
case I can consistently tag DKIM_VERIFIED for DKIM signed messages.

+1, no doubt.

FWIW in draft-ietf-dkim-base-06, section 3.4:
>   To satisfy all requirements, two canonicalization algorithms are
>   defined for each of the header and the body:  a "simple" algorithm
>   that tolerates almost no modification and a "relaxed" algorithm that
>   tolerates common modifications such as white-space replacement and
>   header field line re-wrapping.  A signer MAY specify either algorithm
>   for header or body when signing an email.  If no canonicalization
>   algorithm is specified by the signer, the "simple" algorithm defaults
>   for both header and body.  Verifiers MUST implement both
>   canonicalization algorithms.  Note that the header and body may use
>   different canonicalization algorithms.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179


sidney@sidney.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #3747 is|0                           |1
           obsolete|                            |




------- Additional Comments From sidney@sidney.com  2006-11-13 01:44 -------
Created an attachment (id=3748)
 --> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3748&action=view)
Corrected patch as per comment 17 and added some comments to code

Thanks, Anthony, I agree. Given that the fix strips all line endings, your
syntax more clearly states what is going on.

Committed to trunk revision 474216.

The patch can be applied as is to the 3.1 branch.

Please vote for committing to 3.1




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From achowe@snert.com  2006-11-12 12:48 -------
Earlier today I had a look at parts of DKIM finally 

http://www.ietf.org/internet-drafts/draft-ietf-dkim-base-06.txt

Section 3.4 Canonicalization is of particular interest talking of how the
handling of white space is to conducted with "simple" and "relaxed" algorithms.
The last two paragraphs of 3.4 is of particular note worthy:

   Canonicalization simply prepares the email for presentation to the
   signing or verification algorithm.  It MUST NOT change the
   transmitted data in any way.  Canonicalization of header fields and
   body are described below.

   NOTE:  This section assumes that the message is already in "network
   normal" format (e.g., text is ASCII encoded, lines are separated with
   CRLF characters, etc.).  See also Section 5.3 for information about
   normalizing the message.

I understand "network normal" to mean the message in its RFC 2822 form,
regardless of how it was received, is submitted to the signing/verifier subject
to one of the two white space algorithms.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From sidney@sidney.com  2006-11-12 03:42 -------
Ben Lentz wrote to the dev mailing list:

Greetings,
I am currently using spamassassin 3.1.7 with SnertSoft's milter-spamc 1.10.376
and sendmail 8.13.8. I am having problems getting DKIM_VERIFIED to work, and
after some trial and error, I compared the canonicalized files from a signer and
verifier system, and found that the data being passed to SpamAssassin (and in
turn to Plugin/DKIM.pm and Mail::DKIM) contained what appeared to be an extra
CRLF between the headers and body.

This is almost a "normal" problem with DKIM, as it's sensitive (by design) to
signs of tampering like this.

Applying a patch against the source for milter-spamc, removing what I believe is
the line of code that's injecting this CRLF when the data is passed from
libmitler to spamassassin:

--- milter-spamc.c.orig 2006-10-17 17:07:09.000000000 -0400
+++ milter-spamc.c 2006-10-17 17:26:22.000000000 -0400
@@ -649,7 +649,7 @@
if (data->work.skipMessage)
return SMFIS_CONTINUE;

- (void) BufAddBytes(data->headers, "\r\n", 0, 2);
+ /* (void) BufAddBytes(data->headers, "\r\n", 0, 2); */

/* Insert a simulated Received: header for this server into the
* header block being sent to spamd. It appears that this header

Results in SpamAssassin doing this:
X-Spam-Report: Content preview: test [...]
____
Content analysis details: (-0.1 points, 4.0 required)
____
pts rule name description
---- ---------------------- --------------------------------------------------
...snip...
2.5 MISSING_HB_SEP Missing blank line between message header and body
0.0 DKIM_POLICY_SIGNALL Domain Keys Identified Mail: policy says domain
signs all mails
-0.0 DKIM_VERIFIED Domain Keys Identified Mail: signature passes
verification
0.0 DKIM_SIGNED Domain Keys Identified Mail: message has a signature
0.0 DKIM_POLICY_TESTING Domain Keys Identified Mail: policy says domain
is testing DK

Which is a little unexpected... MISSING_HB_SEP & DKIM_VERIFIED at the same time.

However, I've also discovered that if I keep this line, but change the CRLF to a LF:

--- com/snert/src/milter-spamc/milter-spamc.c.orig    2006-10-18
15:03:37.000000000 -0400
+++ com/snert/src/milter-spamc/milter-spamc.c    2006-10-18 15:03:49.000000000 -0400
@@ -649,7 +649,7 @@
    if (data->work.skipMessage)
        return SMFIS_CONTINUE;

-    (void) BufAddBytes(data->headers, "\r\n", 0, 2);
+    (void) BufAddBytes(data->headers, "\n", 0, 1);

    /* Insert a simulated Received: header for this server into the
     * header block being sent to spamd. It appears that this header

Everything appears to work, MISSING_HB_SEP goes away, and DKIM_VERIFIED works
for signed mail.

However, after providing all this information to Anthony Howe, developer of
milter-spamc he's responded with:
> I'm going to reject this patch on the grounds that I claim the DKIM test in
SpamAssassin is wrong. RFC 2822 line endings for ALL headers, body lines, and
the blank line separating the two are CRLF, not LF. Since the first call to
filterBody() when processing a message does NOT contain the CRLF that separates
the headers from the body, that CRLF is correctly added to the buffer when
filterEndHeaders() is called. Its SpamAssassin that would appear not to be
correctly handling message newlines breaking on LF instead of CRLF.
The part of his theory that doesn't sit right with me is that if I re-scan the
email that's handed back to sendmail after milter-spamc verifies it,
DKIM_VERIFIED works fine. It appears that whatever milter-spamc is handing to
SpamAssassin won't DKIM_VERIFy, but that the final output of milter-spamc will
DKIM_VERIFy. To me, this is an indication of munge when handing off to
SpamAssassin, and a removal of that munge when handed back to sendmail. I know
I've just repeated the same point three times, but I'm trying to articulate it
correctly.

I'd like to help debug and/or troubleshoot this issue, even though I feel my
patch to milter-spamc resolves the issue for my systems.

Any thoughts would be greatly appreciated. If you feel it'd be more appropriate
to post to the users list, I'll do that instead, but I thought that this issue
might be more dev-y.

Thanks in advance.







------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From sidney@sidney.com  2006-11-12 03:47 -------
I suggest capturing the raw bytes that SpamAssassin sees, and post
them here to be reproduced.

--j.

Ben Lentz writes:
> > Nope, the code in the milter appears to use \r\n only.
> > 
> > So, it's entirely possible that the messages I'm trying to verify are 
> > sent using \n, the milter changes them all to \r\n when it passes to 
> > SpamAssassin, and the DKIM checks fail.
>> > > SpamAssassin will use *either* \r\n or \n, depending on what's passed
>> > > to it -- but it has to be consistent.  If the first header line (iirc)
>> > > contains \n, it'll require that throughout.  Is the milter using
>> > > \r\n in some parts and \n elsewhere?
>> > >
>> > > --j.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] [review] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179


sidney@sidney.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|3.1.7                       |3.1.8




------- Additional Comments From sidney@sidney.com  2006-11-13 11:08 -------
correcting milestone - clicked on wrong one




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179


blentz@channing-bete.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |blentz@channing-bete.com






------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From achowe@snert.com  2006-11-12 23:24 -------
Looking at the patch, the line split fix appears to be the good answer, but I
have a small concern about the line that follows after it which appear to insert
the canonical CRLF at the end of the line for the verifier:

+  foreach my $line (split(/\r?\n/s, $header)) {
     $line =~ s/\r?$/\r\n/s;         # ensure \r\n ending

Given the split pattern will remove LF or CRLF newlines, you want the following
line to insert the canonical CRLF newline:

+     $line =~ s/$/\r\n/s;         # ensure \r\n ending








------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] [review] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From sidney@sidney.com  2006-11-14 14:19 -------
Justin said in comment 19: "whatever we do here, we should add a test case to
the Spamassassin test suite for it."

After finding that dkim.org has some test messages, I found that we already have
test cases to use them in trunk in t/dkim.t

Those tests are all disabled with a comment that it is because we do not enable
the DKIM plugin by default. Unless we set up a general mechanism for optional
testing of plugins, I think we should just enable the test. It already is
conditional on Mail::DKIM and its dependencies being installed, as well as
requiring network tests enabled, and I think that is good enough criteria for
using it in make test even if the plugin is not enabled for the SpamAssassin
installation.

However, there is a comment in the tests about a TODO that we should not be
dependent on a third party DNS server for our tests. These check the signatures
using a record in a subdomain of dkim.org. As if to highlight the potential
problem, the tests worked for me yesterday then stopped working this morning
because the DNS server that is listed as handling that subdomain has disappeared
from DNS. Perhaps that is temporary, but if our tests are going to fail when a
system has problems, it should our own systems that are doing the failing.

So I don't want to add tests for this fix and enable t/dkim.t until we have
added a test DKIM public key to our own test domain DNS records.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From sidney@sidney.com  2006-11-12 21:22 -------
Created an attachment (id=3747)
 --> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3747&action=view)
One line patch that fixes this

I have to run off before checking this in, so I'm uploading thts one line patch
instead of possibly messing something up being in a hurry.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From sidney@sidney.com  2006-11-12 03:55 -------
More response from achowe:

WRT to the spamd protocol, generally its fine; its the documentation that needs
more precision so as to restrict implementations to a well defined model. I
think part of my problem here is the loose model I find in SA when combined with
things like DKIM need very exacting definitions and process, else simple things
like white space cause havoc. Please note that I've NOT read the DKIM spec. to
this date, though Eric Allman did provide me an overview last year (summer
2005), so my approach and arguments are from existing RFC standards, primarily
2821 and 2822.

> I didn't realize that we are in effect treating the newline separator
> between headers and body as part of the body when they use different
> newlines. That may be a flat out bug, and we should at least explore the
> implications of doing it one way vs the other and make an explicit decision.

>From what I have observed and based on Ben's trouble ticket comments, I'm
deducing that SA+DKIM relies on either:

a) The first LF or CRLF found in a file given to the CLI or daemon.

b) Take a sample of the first N lines ending in LF or CRLF and use the most
popular. However, you have to be sure to sample BOTH header and body separately
I think, since there may arise cases where header and body differ.

c) Use the newline that acts as the header/body separator as your newline.
However, the header/body separator can not be consider part of the message body
(it never displays), its more closely tied to the headers, so using it as an
indicator of newline style could be problematic- consider an app. that spits out
RFC 2822 headers with CRLF newlines, but read/includes a unix file containing LF
newlines for the body. The CRLF separator would be RFC compliant while the body
would not; very similar to milter-spamc.

d) Find the first blank line, ie. LF-LF or CRLF-CRLF as an indicator of line
style. If using this method, you must not count the header/body separator that
would be the first blank line found. My comments in c) about mixed sources ie.
careless file inclusion apply.

My thinking is that case c) or d) apply to SA, because when Ben modifies the
milter and alters the header/body separator from CRLF to LF, some of his
messages he processes pass SA+DKIM.

If case a) were true, then my passing of headers with CRLF to SA would have
affected the DKIM result such that Ben's mod would have failed. Now from my
limited knowledge, DKIM is suppose to ignore (or has an option to ignore)
whitespace, which I assume also includes vertical whitespace, not just
horizontal whitespace.

Case b) might apply if SA+DKIM is sampling newlines from both header and body
and finding that the body newlines out weigh the header, which is to be expected
but a skewed assumption.

Some refresher/background on the milter API; a copy of docs can be found here:
http://www.milter.org/milter_api/api.html

The milter API is essentially linear in application, with call-backs to handlers
made in step with the SMTP protocol up until DATA. From my understanding,
sendmail saves the DATA content to a temp/queue file before proceeding with the
latter half of the milter handlers, xxfi_header, xxfi_eof, xxfi_body, xxfi_eom.

The xxfi_header callback passes an already parsed and split header as name and
value, so the milter has no idea of the original form concering white space and
newline termination. Consider:

header:           lots of leading white space
    continued second line

The milter will see via handler    xxfi_header(ctx, name, value):

    "header"
    "lots of leading white space continued second line"

Or the value might also be:

    "lots of leading white space    continued second line"

but I've never verified how sendmail handles unfolding long headers in my milter
work.

In either case, my milter has to rejoin the header and pass it to SpamAssassin.
As you can imagine, if DKIM is whitespace sensitive, then the potential for
failures is huge. I could send

    "header: lots of leading white space continued second line\r\n"

    "header:lots of leading white space continued second line\r\n"

    "header:    lots of leading white space continued second
     line\r\n"

since RFC 2822 allows for zero or more white space after the header colon,
though a single space is recommended. However, even with the above rejoining,
this does not match what sendmail might have seen off the wire from a remote
MTA/MUA.

The xxfi_body handler simply receives unprocessed (64K) chunks of the message
body from sendmail. I'm assuming sendmail has been careful to avoid any CRLF to
LF changes that might occur when saving the message to a temp/queue file.
milter-spamc just passes the body chunks "as is" to spamd.

Now Ben's change involves just removing the CR from the CRLF header/boundary
separator such that my reconstructed RFC 2822 compliant headers appear to work
just fine with DKIM, which would indicate that the DKIM implementation ignores
all white space and newlines in the headers, BUT fails to include the
header/body separator as this was all that was changed by Ben's patch.

So my belief has been that there is something wrong in SA+DKIM if the simple
change in the header/body separator influences the result of the DKIM test and
so my statement that the separator is being incorrectly assumed as part of the
message body rather than a protocol element to be excluded.





------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From sidney@sidney.com  2006-11-12 12:08 -------
I found a simple test case that has nothing to do with spamc-milter that
illustrates what is going on. Whether or not it is a bug is another question.

If you send mail to autorespond+dkim@dk.elandsys.com you will receive a reply
that  has a DKIM signature and which triggers the DKIM_VERIFIED rule.

If you save that message as a test case and edit it to make all the line endings
in the header /r/n and in the body /n, then if the blank line separator between
header and body is /r/n, DKIM_VERIFIER fails, while if it is /n then
DKIM_VERIFIER hits.

Note that other than the DKIM plugin, SpamAssassin seems to produce the same
results either way, with the output having all newlines in header and body
converted to be /r/n. When the header and body newlines are all /n,
DKIM_VERIFIER works and the output from spamassassin has all /n newlines.
 



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] [review] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179


sidney@sidney.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Ambiguity in spamd protocol |[review] Ambiguity in spamd
                   |for line endings can break  |protocol for line endings
                   |DKIM_VERIFIED test          |can break DKIM_VERIFIED test
  Status Whiteboard|                            |needs 2 votes for 3.1 branch
   Target Milestone|Undefined                   |3.1.7




------- Additional Comments From sidney@sidney.com  2006-11-13 01:50 -------
I'm already working on test cases based on the tests I ran while doing this bug.
I want to remove some personal email address and host name data before adding
them to the test directory, which is a bit tricky given that changing certain
things invalidates the signature.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5179] Ambiguity in spamd protocol for line endings can break DKIM_VERIFIED test

Posted by bu...@bugzilla.spamassassin.org.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5179





------- Additional Comments From sidney@sidney.com  2006-11-12 03:52 -------
My reply to that was:

I think that both problems - how to fix it and how to convince Anthony -
are simplified by one big difference between SpamAssassin and
milter-spamc. We have to handle mail that may be extracted at any stage
of going from one MUA to an MTA to another MTA to another MUA, on any
platform. We make some assumptions about compliance to RFC2822, but that
only works to the degree that various MUAs and MTAs happen to use
something close to RFC2822 for their local format. If that means that we
have uncertainty about the newline being used in any given installation,
we have to deal with that when adding headers. That's why we can't tell
what newline to use without looking at some of the newlines that are
already there.

On the other hand, milter-spamc is a sendmail milter, which means it
only has to deal with the realities of newlines in the headers that are
given to milters by sendmail, and can assume a unix-like platform.

I notice that documentation for the sendmail milter API says for the
function smfi_addheader in
http://www.sendmail.org/doc/sendmail-current/libmilter/docs/smfi_addheader.html

 To make a multi-line header, insert a line feed (ASCII 0x0a,
 or \n in C) followed by at least one whitespace character such
 as a space (ASCII 0x20) or tab (ASCII 0x09, or \t in C). The
 line feed should NOT be preceded by a carriage return (ASCII 0x0d);
 the MTA will add this automatically.

That's just an indication, but here is what I think the strongest
argument for Anthony is: Take a look at the newlines in the headers that
milter-spamc is receiving from sendmail. If they are unix-style
newlines, then that is proof that sendmail is using unix newlines
instead of RFC2822 newlines at that stage of the processing, and you
have to do the same to keep the results consistent.





------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.