Posted to users@spamassassin.apache.org by "Ring, John C" <jc...@switch.com> on 2005/04/29 22:37:20 UTC

INVALID_MSGID hitting improperly?

I just learned of an issue we're having with a false positive due to a hit on
INVALID_MSGID (and that I'd jacked the score on that up to 20, but that's
another story...).  While I just learned of the issue today, it started a
bit ago for this sender.  Looking in the logs, I see the last message we
received from them where the INVALID_MSGID rule was NOT hitting showed:

2005-04-05 12:52:18 1DIrHM-0004AH-Tp <= Cam.Dowlat@nexans.com
H=(net1xans.agns.fr) [195.75.30.70] P=esmtp S=643141
id="/GUID:QPywoUg6DZ06+yvqCupCVJw*/G=Cam/S=Dowlat/OU=Corporate-Markham/O=Alc
atel Cable/PRMD=ACAB/ADMD=ATTMAIL/C=CA/"@MHS

But then it seems they changed the syntax their system uses to generate the
Message-ID header so that now the INVALID_MSGID rule is matching.  The logs
showed:

2005-04-08 10:31:04 1DJuVf-0006hu-Sv SA: Action: permanently rejected
message: score=21.1 required=5.0 trigger=10.0 (scanned in 4/4 secs |
Message-Id:
"-GUID:QnGodydG460CKmx35BCOvbw*-G=Cam-S=Dowlat-OU=Corporate-Markham-O=Alcate
l Cable-PRMD=ACAB-ADMD=ATTMAIL-C=CA-"@MHS). From <Ca...@nexans.com>
(host=NULL [195.75.30.71]) for oneofmyusers@switch.com

[A different log file showed that the INVALID_MSGID rule was being hit for
the message in question.]

So, looking at:

"/GUID:QPywoUg6DZ06+yvqCupCVJw*/G=Cam/S=Dowlat/OU=Corporate-Markham/O=Alcate
l Cable/PRMD=ACAB/ADMD=ATTMAIL/C=CA/"@MHS

"-GUID:QnGodydG460CKmx35BCOvbw*-G=Cam-S=Dowlat-OU=Corporate-Markham-O=Alcate
l Cable-PRMD=ACAB-ADMD=ATTMAIL-C=CA-"@MHS

Side-by-side, it seems[1] that the only substantial difference between them
is that they've replaced the "/" with "-".  So I'm not certain why, if the
1st is valid, the 2nd one would not be considered valid as well.

The log entries shown occurred while we were still running 3.0.2.  I
upgraded to 3.0.3 earlier today[2], and the INVALID_MSGID is still firing
against this message-id format from this sender.

[1] I looked at these log lines with hexdump -C, and at least in the
log, the "space" was not some non-ascii character, and the - was really a -.

[2] Since this issue seems unrelated to that, I can say we've had no
problems with it so far :)

Re: INVALID_MSGID hitting improperly?

Posted by Matt Kettler <mk...@evi-inc.com>.
Ring, John C wrote:

>So, looking at:
>
>"/GUID:QPywoUg6DZ06+yvqCupCVJw*/G=Cam/S=Dowlat/OU=Corporate-Markham/O=Alcate
>l Cable/PRMD=ACAB/ADMD=ATTMAIL/C=CA/"@MHS
>
>"-GUID:QnGodydG460CKmx35BCOvbw*-G=Cam-S=Dowlat-OU=Corporate-Markham-O=Alcate
>l Cable-PRMD=ACAB-ADMD=ATTMAIL-C=CA-"@MHS
>
>Side-by-side, it seems[1] that the only substantial difference between them
>is that they've replaced the "/" with "-".  So I'm not certain why, if the
>1st is valid, why the 2nd one would not be considered valid as well?
>  
>
Looking at the rule, I'm surprised they aren't BOTH declared invalid.

It would appear that the rule should declare any Message-ID containing a
space to be invalid, unless it contains a parenthesized comment. Both
Message-IDs contain spaces; neither appears to have a comment.
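
As a rough illustration of the behaviour described above, here is a minimal
Python sketch. It is not the SpamAssassin rule itself (that is written as
Perl regexes in the rules files); the function name looks_invalid_msgid is
hypothetical, and the check is deliberately simplified to "whitespace
present and no parenthesized comment". The two IDs are the ones from the
original post with the log's line wrap removed.

```python
import re

def looks_invalid_msgid(msgid: str) -> bool:
    """Hypothetical approximation: flag an ID that contains whitespace
    unless it also carries a parenthesized comment."""
    msgid = msgid.strip()
    has_comment = re.search(r"\(.*\)", msgid) is not None
    has_space = re.search(r"\s", msgid) is not None
    return has_space and not has_comment

# Message-IDs quoted in the original post (slash form, then dash form).
old_id = ('"/GUID:QPywoUg6DZ06+yvqCupCVJw*/G=Cam/S=Dowlat/OU=Corporate-Markham'
          '/O=Alcatel Cable/PRMD=ACAB/ADMD=ATTMAIL/C=CA/"@MHS')
new_id = ('"-GUID:QnGodydG460CKmx35BCOvbw*-G=Cam-S=Dowlat-OU=Corporate-Markham'
          '-O=Alcatel Cable-PRMD=ACAB-ADMD=ATTMAIL-C=CA-"@MHS')

for label, value in (("old", old_id), ("new", new_id)):
    print(label, looks_invalid_msgid(value))
```

Under this simplified check both IDs are flagged, because each contains the
space in "Alcatel Cable" and neither has a comment.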

Looking at RFC2822, it seems to be true that both are invalid. The left
half can be quoted, but the quoted text is not allowed to contain
whitespace.

Snipping the relevant parts from http://www.faqs.org/rfcs/rfc2822.html
in order to look at the construction of the left-half of a message ID:

message-id      =       "Message-ID:" msg-id CRLF

msg-id          =       [CFWS] "<" id-left "@" id-right ">" [CFWS]

id-left         =       dot-atom-text / no-fold-quote / obs-id-left

no-fold-quote   =       DQUOTE *(qtext / quoted-pair) DQUOTE

qtext           =       NO-WS-CTL /     ; Non white space controls
                        %d33 /          ; The rest of the US-ASCII
                        %d35-91 /       ;  characters not including "\"
                        %d93-126        ;  or the quote character

quoted-pair     =       ("\" text) / obs-qp



There's no allowance in there for id-left to contain a space.
dot-atom-text and obs-id-left can't contain a quote character, so I've
not bothered to put their definitions in here.
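
The no-fold-quote production quoted above can be turned into a small
checker. This is only a sketch under stated assumptions: it covers the
no-fold-quote alternative of id-left, ignores obs-qp, and takes the
definitions of NO-WS-CTL (%d1-8, %d11, %d12, %d14-31, %d127) and text (any
US-ASCII character except NUL, CR and LF) from RFC 2822. The names QTEXT,
QUOTED_PAIR and is_no_fold_quote are made up for the example.

```python
import re

# qtext = NO-WS-CTL / %d33 / %d35-91 / %d93-126
QTEXT = r"[\x01-\x08\x0b\x0c\x0e-\x1f\x7f\x21\x23-\x5b\x5d-\x7e]"
# quoted-pair = "\" text   (obs-qp is not handled in this sketch)
QUOTED_PAIR = r"\\[\x01-\x09\x0b\x0c\x0e-\x7f]"
# no-fold-quote = DQUOTE *(qtext / quoted-pair) DQUOTE
NO_FOLD_QUOTE = re.compile(rf'"(?:{QTEXT}|{QUOTED_PAIR})*"')

def is_no_fold_quote(id_left: str) -> bool:
    """True if id_left matches the no-fold-quote production as sketched."""
    return NO_FOLD_QUOTE.fullmatch(id_left) is not None

# id-left of the dash-style Message-ID from the thread (line wrap removed).
id_left = ('"-GUID:QnGodydG460CKmx35BCOvbw*-G=Cam-S=Dowlat-OU=Corporate-Markham'
           '-O=Alcatel Cable-PRMD=ACAB-ADMD=ATTMAIL-C=CA-"')

print(is_no_fold_quote(id_left))                    # bare space inside the quotes
print(is_no_fold_quote(id_left.replace(" ", "")))   # same ID with the space removed
```

A bare space (%d32) is not in qtext, so the ID as sent does not match. Note
that text does include the space, so a backslash-escaped space would be
accepted by this sketch as a quoted-pair; the IDs in question do not escape
it.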

Re: INVALID_MSGID hitting improperly?

Posted by Loren Wilton <lw...@earthlink.net>.
> | If a rule doesn't FP for you, then you can set it to 20.
>
> Sure, you *can*, but why would you want to score a single rule at 20?
> Especially one like this which, as we've just seen, can produce FP's.

Well, as an example I have a couple of rules that *can* (and for others,
outright *would*) FP, scored at 10.  They make my spam scores nice and high,
and I will be **VERY** surprised if I ever get an FP on one of them.  One of
these rules simply checks for 8 or more recipients on the same domain as I
am.  From my POV, the chances of me getting a valid non-spam that is copied
to 8 other people on earthlink explicitly is zero.  I don't *know* 8 other
people on earthlink!

Of course, if I ever do get an FP, I'll lower the score.  Until then though,
it gives me pleasure to watch all those spams trying for 100+ points each...
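
For illustration only, the idea behind such a rule (count the recipients
that share your own domain and flag the message at 8 or more) might be
sketched in Python as below. This is not the actual rule, which would be a
SpamAssassin rule rather than Python; the function name, the THRESHOLD
constant, and the sample addresses are all hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses

THRESHOLD = 8  # "8 or more recipients on the same domain", per the post

def count_same_domain_recipients(raw_message: str, my_domain: str) -> int:
    """Count To/Cc addresses whose domain equals my_domain (case-insensitive)."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    domains = [addr.rpartition("@")[2].lower()
               for _name, addr in getaddresses(headers)]
    return sum(1 for domain in domains if domain == my_domain.lower())

# Made-up sample: eight recipients on one domain plus one elsewhere.
recipients = [f"user{i}@earthlink.net" for i in range(8)] + ["someone@example.com"]
raw = (
    "From: sender@example.org\n"
    "To: " + ", ".join(recipients[:5]) + "\n"
    "Cc: " + ", ".join(recipients[5:]) + "\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

hits = count_same_domain_recipients(raw, "earthlink.net")
print(hits, hits >= THRESHOLD)
```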

        Loren


Re: INVALID_MSGID hitting improperly?

Posted by Craig McLean <cr...@craig.dnsalias.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Theo Van Dinter wrote:
| On Fri, Apr 29, 2005 at 11:14:10PM +0100, Craig McLean wrote:
|
|>| BTW, why have *any* single rule scored at 20? Especially this one.
|>
|>This question, however, still stands.
|
|
| If a rule doesn't FP for you, then you can set it to 20.

Sure, you *can*, but why would you want to score a single rule at 20?
Especially one like this which, as we've just seen, can produce FP's.

| In general, people tend to be a little more conservative and expect FPs,
| but it's really up to your level of comfort and what type of mail you
| receive.

I believe it's got to be better to have many low-scoring rules than one
single rule that scores 300% higher than your SA threshold. But, as you
rightly say, to each their own...


Kind Regards,
Craig.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFCcr4HMDDagS2VwJ4RAk19AJ4p2n14A1n4bDn0kPi+ZvsKaQyMqQCg+e89
j+HyUdVEPLqx2TZRKFkutJU=
=OgX/
-----END PGP SIGNATURE-----

Re: INVALID_MSGID hitting improperly?

Posted by Theo Van Dinter <fe...@kluge.net>.
On Fri, Apr 29, 2005 at 11:14:10PM +0100, Craig McLean wrote:
> | BTW, why have *any* single rule scored at 20? Especially this one.
> 
> This question, however, still stands.

If a rule doesn't FP for you, then you can set it to 20.  In general,
people tend to be a little more conservative and expect FPs, but it's
really up to your level of comfort and what type of mail you receive.

-- 
Randomly Generated Tagline:
"Coffee anyone?
  No thanks, one more cup and I'll jump into warp." - Capt. Janeway (ST:Voyager)

Re: INVALID_MSGID hitting improperly?

Posted by Craig McLean <cr...@craig.dnsalias.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In reply to my own earlier post:

Thanks to Matt Kettler for a better understanding of the facts. I should
have RTFRFC again before opening my mouth!

| They both seem to hit INVALID_MSGID here.

As they should, see below.

| I'm having some problems understanding why, it seems to be the space in
| "Alcatel Cable" as mandated by __SANE_MSGID (which I believe is not
| against RFC2822, as stated, provided it is in a quoted string).

Ok, I take the comment in parentheses back ;-)

|
| BTW, why have *any* single rule scored at 20? Especially this one.
|

This question, however, still stands.

Regards,
Craig.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFCcrGwMDDagS2VwJ4RAqE6AJ9Rf9NwAZAqu0puwwki4ps52j7xogCaAqy2
cZdJXxC16uzfmjXcat8f65I=
=KJL+
-----END PGP SIGNATURE-----

Re: INVALID_MSGID hitting improperly?

Posted by Craig McLean <cr...@craig.dnsalias.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ring, John C wrote:
| I just learned of an issue we're having with a false positive due to a hit on
| INVALID_MSGID (and that I'd jacked the score on that up to 20, but that's
| another story...).  While I just learned of the issue today, it started a
| bit ago for this sender.  Looking in the logs, I see the last message we
| received from them where the INVALID_MSGID rule was NOT hitting showed:
[snip]
| So, looking at:
|
| "/GUID:QPywoUg6DZ06+yvqCupCVJw*/G=Cam/S=Dowlat/OU=Corporate-Markham/O=Alcate
| l Cable/PRMD=ACAB/ADMD=ATTMAIL/C=CA/"@MHS
|
| "-GUID:QnGodydG460CKmx35BCOvbw*-G=Cam-S=Dowlat-OU=Corporate-Markham-O=Alcate
| l Cable-PRMD=ACAB-ADMD=ATTMAIL-C=CA-"@MHS
|
| Side-by-side, it seems[1] that the only substantial difference between them
| is that they've replaced the "/" with "-".  So I'm not certain why, if the
| 1st is valid, why the 2nd one would not be considered valid as well?

They both seem to hit INVALID_MSGID here.
I'm having some problems understanding why; it seems to be the space in
"Alcatel Cable", as mandated by __SANE_MSGID (which I believe is not
against RFC2822, as stated, provided it is in a quoted string). It would
be interesting to see the full headers of the message that hit this rule.

BTW, why have *any* single rule scored at 20? Especially this one.

Craig.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFCcq7mMDDagS2VwJ4RAt2HAJ90DPerqRK1svv4hRYQmibyqFTxPwCgsXLv
leuuAl6eG9xgM+p7IDFxqcA=
=Tpi5
-----END PGP SIGNATURE-----