Posted to users@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2007/08/16 11:15:03 UTC

Re: PDF rule not matching -- split line content type?

Jo --

I've checked that in as 'TVD_PDF_FINGER01_JO'.  You can track its progress
at http://ruleqa.SpamAssassin.org .

by the way -- it's pretty easy for you to test your own rules in your own
environment, actually, and I recommend you try it out.  These are the
tools we use:

  http://wiki.apache.org/spamassassin/MassCheck
  http://wiki.apache.org/spamassassin/HitFrequencies

They are bundled with SpamAssassin in the "masses" folder.  All the
documentation is there on the wiki.

--j.

Jo Rhett writes:
> Since nobody is paying attention, let me clarify.  The current rule is 
> wrong:
> 
> mimeheader __TVD_MIME_ATT_AP    Content-Type =~ /^application\/pdf/i
> mimeheader __TVD_MIME_ATT_AOPDF Content-Type =~ 
> /^application\/octet-stream.*\.pdf/i
> 
> meta TVD_PDF_FINGER01  __TVD_MIME_CT_MM && __TVD_MIME_ATT_TP && 
> __TVD_MIME_ATT && !__TVD_BODY
> 
> This evaluates to exactly the same as this:
> 
> meta TVD_PDF_FINGER01  __TVD_MIME_CT_MM && __TVD_MIME_ATT_TP && !__TVD_BODY
> 
> I believe that the original rule's intent was this:
> 
> meta TVD_PDF_FINGER01  __TVD_MIME_CT_MM && __TVD_MIME_ATT && !__TVD_BODY
> 
> Can someone with commit rights please test and commit this change?
> Thank you.
> 
> Jo Rhett wrote:
> > Well actually I think the rule has a bug.  Why OR the two mime types as 
> > a new meta, and then require one of the two in the final meta?   The net 
> > effect is that if ATT_TP is true it matches, but if ATT_AOPDF is true it 
> > will never match.
> > 
> > I believe that the following will work better - work in every situation 
> > that it worked before, and not fail when the mime type is octet-stream:
> >    meta TVD_PDF_FINGER01           __TVD_MIME_CT_MM && __TVD_MIME_ATT && 
> > !__TVD_BODY
> > 
> > Would someone kindly evaluate this change and possibly fix the rule?  
> > Thanks.
> > 
> > On Aug 14, 2007, at 10:41 PM, Loren Wilton wrote:
> >>>> rawbody __TVD_BODY              /\S{4}/
> >>
> >> true
> >>
> >>>> header __TVD_MIME_CT_MM         Content-Type =~ /^multipart\/mixed/i
> >>
> >> true
> >>
> >>>> mimeheader __TVD_MIME_ATT_AP    Content-Type =~ /^application\/pdf/i
> >>
> >> false
> >>
> >>>> mimeheader __TVD_MIME_ATT_AOPDF Content-Type =~ 
> >>>> /^application\/octet-stream.*\.pdf/i
> >>
> >> maybe true, maybe not.  I would hope newlines were translated to 
> >> spaces by the mimehdr plugin, but maybe they weren't.  Try /is instead 
> >> of /i and see if it helps.
> >>
> >>>> meta __TVD_MIME_ATT             __TVD_MIME_ATT_AP || 
> >>>> __TVD_MIME_ATT_AOPDF
> >>
> >> maybe true
> >>
> >>>> meta TVD_PDF_FINGER01
> >>    __TVD_MIME_CT_MM
> >> true
> >>    && __TVD_MIME_ATT_TP
> >> undefined here, can't say
> >>    && __TVD_MIME_ATT
> >> maybe true
> >>    && !__TVD_BODY
> >> true
> >>
> >> So, not knowing what is in __TVD_MIME_ATT_TP, I haven't a clue if it 
> >> will fire, since that is part of an 'and'.  If I assume it to be true 
> >> then I'm still not sure because of the multiline possibility in 
> >> __TVD_MIME_ATT.
> >>
> >>        Loren
> >>
> >>>> describe TVD_PDF_FINGER01       Mail matches standard pdf spam 
> >>>> fingerprint
> >>
> >>
> >> ----- Original Message ----- From: "Jo Rhett" <jr...@netconsonance.com>
> >> To: "SpamAssassin Users" <us...@spamassassin.apache.org>
> >> Sent: Tuesday, August 14, 2007 10:16 PM
> >> Subject: Re: PDF rule not matching -- split line content type?
> >>
> >>
> >>> Can someone clue me in on why this rule isn't matching?
> >>>
> >>> Jo Rhett wrote:
> >>>> So I've been getting a metric ton of PDF spam.  Investigating the 
> >>>> rule that is supposed to match this, I see
> >>>>
> >>>> rawbody __TVD_BODY              /\S{4}/
> >>>> header __TVD_MIME_CT_MM         Content-Type =~ /^multipart\/mixed/i
> >>>> meta __TVD_MIME_ATT             __TVD_MIME_ATT_AP || 
> >>>> __TVD_MIME_ATT_AOPDF
> >>>> meta TVD_PDF_FINGER01           __TVD_MIME_CT_MM && 
> >>>> __TVD_MIME_ATT_TP && __TVD_MIME_ATT && !__TVD_BODY
> >>>> describe TVD_PDF_FINGER01       Mail matches standard pdf spam 
> >>>> fingerprint
> >>>>
> >>>> mimeheader __TVD_MIME_ATT_AP    Content-Type =~ /^application\/pdf/i
> >>>> mimeheader __TVD_MIME_ATT_AOPDF Content-Type =~ 
> >>>> /^application\/octet-stream.*\.pdf/i
> >>>>
> >>>> The following message appears to match perfectly with this, except 
> >>>> for perhaps that the content type is spread across two lines?  I 
> >>>> haven't checked the code, but would this matter?
> >>>>
> >>>> Return-Path: <Yo...@nic.za.net>
> >>>> Received: from mail.netconsonance.com ([unix socket])
> >>>>      by triceratops.netconsonance.com (Cyrus v2.3.8) with LMTPA;
> >>>>      Tue, 14 Aug 2007 06:27:16 -0700
> >>>> Received: from [84.21.29.58] ([84.21.29.58])
> >>>>     by mail.netconsonance.com (8.14.1/8.14.1) with ESMTP id 
> >>>> l7EDR4UU095951
> >>>>     for <jr...@lizardarts.com>; Tue, 14 Aug 2007 06:27:08 -0700 (PDT)
> >>>>     (envelope-from Yohann@nic.za.net)
> >>>> X-Virus-Scanned: amavisd-new at netconsonance.com
> >>>> X-Spam-Score: 2.033
> >>>> X-Spam-Level: **
> >>>> X-Spam-Status: No, score=2.033 tagged_above=-999 required=4
> >>>>     tests=[DK_POLICY_SIGNSOME=0.001, HTML_MESSAGE=0.001,
> >>>>     MIME_HTML_MOSTLY=0.699, RCVD_IN_BL_SPAMCOP_NET=1.332]
> >>>> Received: from x-6of7ca27m39al ([158.187.61.7]) by [84.21.29.58] 
> >>>> with Microsoft SMTPSVC(6.0.3790.1830);
> >>>>     Tue, 14 Aug 2007 15:27:01 +0200
> >>>> Message-ID: <00...@x6of7ca27m39al>
> >>>> From: "Yohann michels" <Yo...@nic.za.net>
> >>>> To: jrhett@lizardarts.com
> >>>> Subject: bill-jrhett
> >>>> Date: Tue, 14 Aug 2007 15:26:28 +0200
> >>>> MIME-Version: 1.0
> >>>> Content-Type: multipart/mixed;
> >>>>     boundary="----=_NextPart_000_000E_01C7DE87.7C1E24D0"
> >>>> X-Priority: 3
> >>>> X-MSMail-Priority: Normal
> >>>> X-Mailer: Microsoft Outlook Express 6.00.2900.3138
> >>>> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138
> >>>>
> >>>>
> >>>> ------=_NextPart_000_000E_01C7DE87.7C1E24D0
> >>>> Content-Type: multipart/alternative;
> >>>>     boundary="----=_NextPart_001_000F_01C7DE87.7C1E24D0"
> >>>>
> >>>>
> >>>> ------=_NextPart_001_000F_01C7DE87.7C1E24D0
> >>>> Content-Transfer-Encoding: quoted-printable
> >>>> Content-Type: text/plain;
> >>>>     charset=windows-1250
> >>>>
> >>>>
> >>>> ------=_NextPart_001_000F_01C7DE87.7C1E24D0
> >>>> Content-Transfer-Encoding: quoted-printable
> >>>> Content-Type: text/html;
> >>>>     charset=windows-1250
> >>>>
> >>>> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
> >>>> <HTML><HEAD>
> >>>> <META http-equiv=3DContent-Type content=3D"text/html; =
> >>>> charset=3Dwindows-1250">
> >>>> <META content=3D"MSHTML 6.00.2900.3132" name=3DGENERATOR>
> >>>> <STYLE></STYLE>
> >>>> </HEAD>
> >>>> <BODY bgColor=3D#ffffff>
> >>>> <DIV><FONT face=3DArial size=3D2></FONT>&nbsp;</DIV></BODY></HTML>
> >>>>
> >>>> ------=_NextPart_001_000F_01C7DE87.7C1E24D0--
> >>>>
> >>>> ------=_NextPart_000_000E_01C7DE87.7C1E24D0
> >>>> Content-Transfer-Encoding: base64
> >>>> Content-Type: application/octet-stream;
> >>>>     name=marketing-jrhett.pdf
> >>>> Content-Disposition: attachment;
> >>>>     filename=marketing-jrhett.pdf
> >>>>
> >>>> JVBERi0xLjUNJeLjz9MNCjIyIDAgb2JqPDwvSFs0MzYgMTQ4XS9MaW5lYXJpemVkIDEvRSAxNjU5 
> >>>> L0wgMTM1NzYvTiAxMC9PIDI2L1QgMTMwNzQ+Pg1lbmRvYmoNICAgICAgICAgICAgICAgICAgICAg 
> >>>> *snip*
> >>>>
> >>>>
> >>>
> >>>
> >>> -- 
> >>> Jo Rhett
> >>> Net Consonance ... net philanthropy, open source and other randomness
> >>
> >>
> > 
> 
> 
> -- 
> Jo Rhett
> Net Consonance ... net philanthropy, open source and other randomness
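
The folded-header question in the quoted exchange (would a Content-Type
split across two lines defeat the pattern, and would /is help?) can be
checked outside SpamAssassin. The sketch below is Python rather than
Perl and only illustrates the regex behaviour: "." does not cross a
newline unless the single-line flag is set (Perl /s, Python re.DOTALL).
Whether SpamAssassin hands mimeheader rules a folded or an unfolded
value is not established in this thread, so both forms are tried. The
function name ct_matches is made up for illustration.

```python
import re

# The pattern from __TVD_MIME_ATT_AOPDF, written as a Python regex.
PATTERN = r"^application/octet-stream.*\.pdf"

def ct_matches(value: str, single_line: bool = False) -> bool:
    """Return True if a Content-Type value matches the AOPDF pattern.

    single_line=True mimics Perl's /s modifier (re.DOTALL in Python).
    """
    flags = re.IGNORECASE | (re.DOTALL if single_line else 0)
    return re.search(PATTERN, value, flags) is not None

# Header value as it appears in the sample message (folded over two lines).
folded = "application/octet-stream;\n    name=marketing-jrhett.pdf"
# The same value after unfolding: newline plus indent collapsed to one space.
unfolded = re.sub(r"\r?\n[ \t]+", " ", folded)

print(ct_matches(folded))                    # folded value, /i only
print(ct_matches(folded, single_line=True))  # folded value, /is
print(ct_matches(unfolded))                  # unfolded value, /i only
```

If the value reaches the rule still folded, only the /is form can
match; if it is unfolded first, the original /i form is enough.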

Re: PDF rule not matching -- split line content type?

Posted by Jo Rhett <jr...@netconsonance.com>.
Justin Mason wrote:
> I've checked that in as 'TVD_PDF_FINGER01_JO'.  You can track its progress
> at http://ruleqa.SpamAssassin.org .

Thank you.

> by the way -- it's pretty easy for you to test your own rules in your own
> environment, actually, and I recommend you try it out.

I do so for any rule I post about on the list.  But see below:

> These are the tools we use:
> 
>   http://wiki.apache.org/spamassassin/MassCheck
>   http://wiki.apache.org/spamassassin/HitFrequencies
> 
> They are bundled with SpamAssassin in the "masses" folder.  All the
> documentation is there on the wiki.

I know that.  But before a rule gets committed it goes through testing 
against a much larger amount of spam/ham than I have available to me.

And I honestly don't have the time available to me to spend hours each 
day building spam/ham corpora.  It simply isn't going to happen.  So the 
tools aren't very useful as generalized testing tools in my environment.

-- 
Jo Rhett
Net Consonance ... net philanthropy, open source and other randomness

Re: PDF rule not matching -- split line content type?

Posted by Jo Rhett <jr...@netconsonance.com>.
Sorry, Theo -- I keep working on the rule without changing the subject 
line.  The rule does have a different problem as I detailed below.

I'm trying to find a useful way to provide a different rule that matches 
what you identified in that one, but I haven't had the time this week to 
work on it.  (and lack a good testing ground, as mentioned in the 
previous message)

Theo Van Dinter wrote:
> FWIW, I responded a few days ago with an explanation of why the rule isn't
> hitting.  It has nothing to do with content-type headers and everything to do
> with the fact that the message body isn't empty: there's HTML content.
> 
> 
> On Thu, Aug 16, 2007 at 10:15:03AM +0100, Justin Mason wrote:
>> Jo --
>>
>> I've checked that in as 'TVD_PDF_FINGER01_JO'.  You can track its progress
>> at http://ruleqa.SpamAssassin.org .
>>
>> by the way -- it's pretty easy for you to test your own rules in your own
>> environment, actually, and I recommend you try it out.  These are the
>> tools we use:
>>
>>   http://wiki.apache.org/spamassassin/MassCheck
>>   http://wiki.apache.org/spamassassin/HitFrequencies
>>
>> They are bundled with SpamAssassin in the "masses" folder.  All the
>> documentation is there on the wiki.
>>
>> --j.
>>
>> Jo Rhett writes:
>>> Since nobody is paying attention, let me clarify.  The current rule is 
>>> wrong:
>>>
>>> mimeheader __TVD_MIME_ATT_AP    Content-Type =~ /^application\/pdf/i
>>> mimeheader __TVD_MIME_ATT_AOPDF Content-Type =~ 
>>> /^application\/octet-stream.*\.pdf/i
>>>
>>> meta TVD_PDF_FINGER01  __TVD_MIME_CT_MM && __TVD_MIME_ATT_TP && 
>>> __TVD_MIME_ATT && !__TVD_BODY
>>>
>>> This evaluates to exactly the same as this:
>>>
>>> meta TVD_PDF_FINGER01  __TVD_MIME_CT_MM && __TVD_MIME_ATT_TP && !__TVD_BODY
>>>
>>> I believe that the original rule's intent was this:
>>>
>>> meta TVD_PDF_FINGER01  __TVD_MIME_CT_MM && __TVD_MIME_ATT && !__TVD_BODY
>>>
>>> Can someone with commit rights please test and commit this change?
>>> Thank you.
>>>
>>> Jo Rhett wrote:
>>>> Well actually I think the rule has a bug.  Why OR the two mime types as 
>>>> a new meta, and then require one of the two in the final meta?   The net 
>>>> effect is that if ATT_TP is true it matches, but if ATT_AOPDF is true it 
>>>> will never match.
>>>>
>>>> I believe that the following will work better - work in every situation 
>>>> that it worked before, and not fail when the mime type is octet-stream:
>>>>    meta TVD_PDF_FINGER01           __TVD_MIME_CT_MM && __TVD_MIME_ATT && 
>>>> !__TVD_BODY
>>>>
>>>> Would someone kindly evaluate this change and possibly fix the rule?  
>>>> Thanks.
> 


-- 
Jo Rhett
Net Consonance ... net philanthropy, open source and other randomness
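
The meta-rule argument quoted above can be checked by brute force. The
sketch below is plain Python, not SpamAssassin, and enumerates every
combination of the five sub-rule results. __TVD_MIME_ATT_TP is never
defined in this thread, so the sketch treats it in two ways: as a
stand-in for __TVD_MIME_ATT_AP (an assumption, matching how the
argument reads) and as an independent condition. All function names
are made up for illustration.

```python
from itertools import product

def original(mm, tp, ap, aopdf, body):
    """TVD_PDF_FINGER01 as quoted in the thread."""
    att = ap or aopdf                      # meta __TVD_MIME_ATT
    return mm and tp and att and not body

def simplified(mm, tp, ap, aopdf, body):
    """The form the quoted rule is claimed to reduce to."""
    return mm and tp and not body

def proposed(mm, tp, ap, aopdf, body):
    """The proposed replacement: drop __TVD_MIME_ATT_TP."""
    att = ap or aopdf
    return mm and att and not body

def differing_inputs(f, g, tp_is_ap):
    """List the inputs (mm, tp, ap, aopdf, body) on which f and g disagree."""
    rows = []
    for row in product([False, True], repeat=5):
        mm, tp, ap, aopdf, body = row
        if tp_is_ap and tp != ap:
            continue  # skip rows that contradict the assumption TP == AP
        if f(*row) != g(*row):
            rows.append(row)
    return rows

print(differing_inputs(original, simplified, tp_is_ap=True))
print(differing_inputs(original, proposed, tp_is_ap=True))
print(differing_inputs(original, simplified, tp_is_ap=False))
```

Under the TP == AP assumption the first list is empty, so the two forms
agree, and the second list holds the single octet-stream-only case that
the proposed rule would newly catch. If TP is an unrelated condition,
the third list shows the simplification no longer holds.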

Re: PDF rule not matching -- split line content type?

Posted by Theo Van Dinter <fe...@apache.org>.
FWIW, I responded a few days ago with an explanation of why the rule isn't
hitting.  It has nothing to do with content-type headers and everything to do
with the fact that the message body isn't empty: there's HTML content.
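
A rough way to see this, sketched in Python rather than in SpamAssassin
itself: __TVD_BODY is the rawbody pattern /\S{4}/, which fires on any
run of four non-whitespace characters. The sketch assumes, as the
explanation above implies, that the rawbody rule sees the HTML markup
of the text/html part. The helper names body_is_empty and finger01 are
made up for illustration, and the HTML fragment is copied from the
sample message quoted in this thread.

```python
import re

# __TVD_BODY from the thread: rawbody /\S{4}/
TVD_BODY = re.compile(r"\S{4}")

# A fragment of the text/html part of the sample message.
html_part = (
    "<HTML><HEAD>\n"
    "<STYLE></STYLE>\n"
    "</HEAD>\n"
    "<BODY bgColor=3D#ffffff>\n"
)

def body_is_empty(raw: str) -> bool:
    """True when the text has no run of four non-whitespace characters."""
    return TVD_BODY.search(raw) is None

def finger01(ct_mm: bool, att: bool, raw: str) -> bool:
    """The meta without __TVD_MIME_ATT_TP: CT_MM && ATT && !__TVD_BODY."""
    return ct_mm and att and body_is_empty(raw)

print(body_is_empty(html_part))          # markup alone trips /\S{4}/
print(finger01(True, True, html_part))   # so the meta cannot fire
print(finger01(True, True, " \n\n"))     # whitespace-only body for contrast
```

Even with both header conditions true, the "!__TVD_BODY" term is false
for this message, so neither the original nor the modified meta would
hit it.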


On Thu, Aug 16, 2007 at 10:15:03AM +0100, Justin Mason wrote:
> 
> Jo --
> 
> I've checked that in as 'TVD_PDF_FINGER01_JO'.  You can track its progress
> at http://ruleqa.SpamAssassin.org .
> 
> by the way -- it's pretty easy for you to test your own rules in your own
> environment, actually, and I recommend you try it out.  These are the
> tools we use:
> 
>   http://wiki.apache.org/spamassassin/MassCheck
>   http://wiki.apache.org/spamassassin/HitFrequencies
> 
> They are bundled with SpamAssassin in the "masses" folder.  All the
> documentation is there on the wiki.
> 
> --j.
> 
> Jo Rhett writes:
> > Since nobody is paying attention, let me clarify.  The current rule is 
> > wrong:
> > 
> > mimeheader __TVD_MIME_ATT_AP    Content-Type =~ /^application\/pdf/i
> > mimeheader __TVD_MIME_ATT_AOPDF Content-Type =~ 
> > /^application\/octet-stream.*\.pdf/i
> > 
> > meta TVD_PDF_FINGER01  __TVD_MIME_CT_MM && __TVD_MIME_ATT_TP && 
> > __TVD_MIME_ATT && !__TVD_BODY
> > 
> > This evaluates to exactly the same as this:
> > 
> > meta TVD_PDF_FINGER01  __TVD_MIME_CT_MM && __TVD_MIME_ATT_TP && !__TVD_BODY
> > 
> > I believe that the original rule's intent was this:
> > 
> > meta TVD_PDF_FINGER01  __TVD_MIME_CT_MM && __TVD_MIME_ATT && !__TVD_BODY
> > 
> > Can someone with commit rights please test and commit this change?
> > Thank you.
> > 
> > Jo Rhett wrote:
> > > Well actually I think the rule has a bug.  Why OR the two mime types as 
> > > a new meta, and then require one of the two in the final meta?   The net 
> > > effect is that if ATT_TP is true it matches, but if ATT_AOPDF is true it 
> > > will never match.
> > > 
> > > I believe that the following will work better - work in every situation 
> > > that it worked before, and not fail when the mime type is octet-stream:
> > >    meta TVD_PDF_FINGER01           __TVD_MIME_CT_MM && __TVD_MIME_ATT && 
> > > !__TVD_BODY
> > > 
> > > Would someone kindly evaluate this change and possibly fix the rule?  
> > > Thanks.

-- 
Randomly Selected Tagline:
"Premature optimisation is the root of all evil." - Knuth