Posted to users@spamassassin.apache.org by Henrik K <he...@hege.li> on 2022/04/26 13:04:13 UTC

Re: your mail

On Tue, Apr 26, 2022 at 02:35:25PM +0200, Matus UHLAR - fantomas wrote:
> Hello,
> 
> is it possible to match message headers in rfc822 attachments?
> 
> from what I know, "header" rules only apply to mail headers and "mimeheader"
> rules only apply to MIME headers.
> 
> body and rawbody afaik only search in bodies of messages or included
> messages.
> 
> I asked some time ago, but without success:
> 
> https://marc.info/?l=spamassassin-users&m=132282473328809&w=2
> 
> is this possible now, or do we need a solution outside SA for this?

full FOO /\nContent-Type: message\/rfc822.*?\nReceived:(?:[^\n]+\n\s+){0,3}[^\n]*\b1.2.3.4\b/s
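
For anyone who wants to try the rule above outside SpamAssassin, here is a minimal Python sketch. It assumes Python's re module as a stand-in for the Perl regex engine (the constructs used here should behave the same way), and the sample message, host names and addresses are invented for illustration. It also shows a side effect of the unescaped dots in 1.2.3.4: each dot matches any character, so an address like 112.3.4.5 hits too.

```python
import re

# The rule above, transcribed verbatim; the trailing /s becomes re.DOTALL.
LOOSE = re.compile(
    r"\nContent-Type: message\/rfc822.*?\nReceived:"
    r"(?:[^\n]+\n\s+){0,3}[^\n]*\b1.2.3.4\b",
    re.DOTALL,
)
# Same rule with the dots in the address escaped.
STRICT = re.compile(
    r"\nContent-Type: message\/rfc822.*?\nReceived:"
    r"(?:[^\n]+\n\s+){0,3}[^\n]*\b1\.2\.3\.4\b",
    re.DOTALL,
)


def sample(ip):
    """Build a made-up raw message carrying one message/rfc822 attachment."""
    return (
        "From: reporter@example.com\n"
        "Content-Type: multipart/mixed; boundary=\"b1\"\n"
        "\n"
        "--b1\n"
        "Content-Type: message/rfc822\n"
        "Content-Disposition: attachment\n"
        "\n"
        "Received: from mail.example.net (mail.example.net [" + ip + "])\n"
        "  by mx.example.org with ESMTP id abcdef\n"
        "Subject: hello\n"
        "\n"
        "body\n"
        "--b1--\n"
    )


for ip in ("1.2.3.4", "112.3.4.5"):
    raw = sample(ip)
    print(ip, bool(LOOSE.search(raw)), bool(STRICT.search(raw)))
```

With this sample, both patterns hit on 1.2.3.4, while only the unescaped one hits on 112.3.4.5.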


Re: your mail

Posted by Kris Deugau <kd...@vianet.ca>.
Matus UHLAR - fantomas wrote:
>>> On Tue, Apr 26, 2022 at 02:35:25PM +0200, Matus UHLAR - fantomas wrote:
>>> > is it possible to match message headers in rfc822 attachments?
>>> >
>>> > from what I know, "header" rules only apply to mail headers and
>>> > "mimeheader" rules only apply to MIME headers.
>>> >
>>> > body and rawbody afaik only search in bodies of messages or included
>>> > messages.

> On 26.04.22 16:11, Henrik K wrote:
>> Maybe a bit safer version that doesn't log huge strings and run wild
>>
>> full FOO /^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)/s
>>
> 
> Doesn't this require MIME headers in a specific order that may not be
> fulfilled?

If your attached message has headers that are mixed in with the MIME 
headers then it's badly (arguably maliciously) structured and probably 
not sanely parseable.

Pulling a quick sample from the spam reporting account here:

====

------=_NextPart_000_0011_01D858C3.7B7DAB10
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment

Received: from vmsa103.odn.ne.jp by cmsa103.odn.ne.jp with ESMTP id
  <20...@msa103.odn.ne.jp> for
[...and the rest of the attached message...]

====

However, Henrik's {0,1024} safety barrier is unfortunately likely to 
skip intended matches because of huge/multiple DKIM signatures and other 
odds and ends that certain mail clients or platforms take delight in 
stuffing into email headers.  I've lost count of the ones I've seen with 
~40-50k+ characters just in the message headers, never mind all the 
Stupid found in the message body.  (I think the record has to be 
something like 200k+ for a one-line message with no embedded images. 
Yay progress?)
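
A rough Python illustration of that failure mode, under some assumptions: Python's re module stands in for the Perl regex engine, and the message, host names and the oversized X-Big-Signature header are invented (a single long header line is used for brevity, where real mail would fold it). The only thing varied is how many characters sit between the Content-Type line and the Received line.

```python
import re

# The bounded rule quoted above, transcribed verbatim; /s becomes re.DOTALL.
BOUNDED = re.compile(
    r"^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:"
    r"(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)",
    re.DOTALL,
)


def attached(pad_len):
    """Made-up message whose attachment starts with a padded header."""
    return (
        "Subject: report\n"
        "Content-Type: multipart/mixed; boundary=\"b1\"\n"
        "\n"
        "--b1\n"
        "Content-Type: message/rfc822\n"
        "\n"
        "X-Big-Signature: " + "A" * pad_len + "\n"
        "Received: from host.example.net ([1.2.3.4]) by mx.example.org\n"
        "\n"
        "body\n"
        "--b1--\n"
    )


# Small header: the Received line is within 1024 characters, so the rule hits.
print(bool(BOUNDED.search(attached(200))))
# Large header: the gap exceeds 1024 characters, so the same address is missed.
print(bool(BOUNDED.search(attached(5000))))
```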

Can you expand some more on your use case?  You may be better off 
splitting the attached message off outside of SA (which is relatively 
simple[0]) and processing it directly.  If there are attributes from the 
parent message needed when processing the child, your splitter could add 
them as pseudoheaders on the child message passed to SA.  Looking back 
at your previous post this seems likely to be easier than trying to 
wedge things fully inside SA.

-kgd
[0]  I'm slightly terrified by how many abuse departments at companies 
that should really know better, and be able to afford more and better 
talent than me at this kind of mail-mangling, do not seem to know what 
to do with an RFC822 attachment.  It took me less than a week to 
implement a fairly solid on-delivery splitter like this for FN and FP 
reporting, and I've since extended it to handle several mangled 
variations to the tune of maybe 5 hours or so each.
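
For illustration only, the splitter idea described above could be sketched in Python with the standard email package. This is not the splitter mentioned in the message, just a minimal sketch under assumptions: the function name split_attached and the X-Parent-* pseudoheader names are hypothetical, and mangled or unusually encoded attachments are not handled.

```python
import email
from email import policy


def split_attached(raw_bytes, copy_headers=("From", "Subject")):
    """Yield each message/rfc822 attachment as bytes.

    Selected parent headers are copied onto the child as X-Parent-*
    pseudoheaders (hypothetical names, chosen for this example).
    """
    parent = email.message_from_bytes(raw_bytes, policy=policy.default)
    for part in parent.walk():
        if part.get_content_type() != "message/rfc822":
            continue
        # The parser stores the embedded message as the part's only payload.
        child = part.get_payload(0)
        for name in copy_headers:
            if parent[name] is not None:
                child["X-Parent-" + name] = str(parent[name])
        yield child.as_bytes()
```

Each yielded child could then be handed to SpamAssassin on its own (for example piped to spamc), where ordinary header rules would see the child's Received lines and the copied pseudoheaders.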

Re: your mail

Posted by Henrik K <he...@hege.li>.
On Tue, Apr 26, 2022 at 05:11:47PM +0300, Henrik K wrote:
> On Tue, Apr 26, 2022 at 03:59:36PM +0200, Matus UHLAR - fantomas wrote:
> > On 26.04.22 16:11, Henrik K wrote:
> > > Maybe a bit safer version that doesn't log huge strings and run wild
> > > 
> > > full FOO /^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)/s
> > 
> > Doesn't this require MIME headers in a specific order that may not be
> > fulfilled?
> 
> Well, if you want to match rfc822 contents, its Received: headers can only
> come after an rfc822 declaration.
> 
> Other than that it's up to you to figure out, since there are no samples.  Of
> course this doesn't replace a full parser, but as long as the stuff you
> receive doesn't vary much, there's no reason for it not to work.

... as long as the whole rfc822 content isn't base64 encoded. That's probably not that common.
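
A small Python sketch of why base64 defeats the rule: a "full" rule sees the raw, undecoded message, and the encoded attachment no longer contains a literal Received: line. Assumptions: Python's re module stands in for the Perl regex engine, and the messages and host names are made up.

```python
import base64
import re

# The bounded rule from this thread, transcribed verbatim; /s becomes re.DOTALL.
BOUNDED = re.compile(
    r"^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:"
    r"(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)",
    re.DOTALL,
)

# Invented attached message.
INNER = (
    "Received: from host.example.net ([1.2.3.4]) by mx.example.org\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)


def wrap(payload, cte):
    """Wrap a payload as a message/rfc822 part of a made-up parent message."""
    return (
        "Subject: report\n"
        "Content-Type: multipart/mixed; boundary=\"b1\"\n"
        "\n"
        "--b1\n"
        "Content-Type: message/rfc822\n"
        "Content-Transfer-Encoding: " + cte + "\n"
        "\n"
        + payload +
        "--b1--\n"
    )


plain = wrap(INNER, "7bit")
encoded = wrap(base64.encodebytes(INNER.encode("ascii")).decode("ascii"), "base64")

print(bool(BOUNDED.search(plain)))    # attachment in clear text
print(bool(BOUNDED.search(encoded)))  # same attachment, base64 encoded
```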


Re: your mail

Posted by Henrik K <he...@hege.li>.
On Tue, Apr 26, 2022 at 03:59:36PM +0200, Matus UHLAR - fantomas wrote:
> On 26.04.22 16:11, Henrik K wrote:
> > Maybe a bit safer version that doesn't log huge strings and run wild
> > 
> > full FOO /^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)/s
> 
> Doesn't this require MIME headers in a specific order that may not be
> fulfilled?

Well, if you want to match rfc822 contents, its Received: headers can only
come after an rfc822 declaration.

Other than that it's up to you to figure out, since there are no samples.  Of
course this doesn't replace a full parser, but as long as the stuff you
receive doesn't vary much, there's no reason for it not to work.


Re: your mail

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
>> On Tue, Apr 26, 2022 at 02:35:25PM +0200, Matus UHLAR - fantomas wrote:
>> > is it possible to match message headers in rfc822 attachments?
>> >
>> > from what I know, "header" rules only apply to mail headers and "mimeheader"
>> > rules only apply to MIME headers.
>> >
>> > body and rawbody afaik only search in bodies of messages or included
>> > messages.
>> >
>> > I asked some time ago, but without success:
>> >
>> > https://marc.info/?l=spamassassin-users&m=132282473328809&w=2
>> >
>> > is this possible now, or do we need a solution outside SA for this?

>On Tue, Apr 26, 2022 at 04:04:13PM +0300, Henrik K wrote:
>> full FOO /\nContent-Type: message\/rfc822.*?\nReceived:(?:[^\n]+\n\s+){0,3}[^\n]*\b1.2.3.4\b/s

On 26.04.22 16:11, Henrik K wrote:
>Maybe a bit safer version that doesn't log huge strings and run wild
>
>full FOO /^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)/s

Doesn't this require MIME headers in a specific order that may not be
fulfilled?

-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Honk if you love peace and quiet.

Re: your mail

Posted by Henrik K <he...@hege.li>.
On Tue, Apr 26, 2022 at 04:04:13PM +0300, Henrik K wrote:
> On Tue, Apr 26, 2022 at 02:35:25PM +0200, Matus UHLAR - fantomas wrote:
> > Hello,
> > 
> > is it possible to match message headers in rfc822 attachments?
> > 
> > from what I know, "header" rules only apply to mail headers and "mimeheader"
> > rules only apply to MIME headers.
> > 
> > body and rawbody afaik only search in bodies of messages or included
> > messages.
> > 
> > I asked some time ago, but without success:
> > 
> > https://marc.info/?l=spamassassin-users&m=132282473328809&w=2
> > 
> > is this possible now, or do we need a solution outside SA for this?
> 
> full FOO /\nContent-Type: message\/rfc822.*?\nReceived:(?:[^\n]+\n\s+){0,3}[^\n]*\b1.2.3.4\b/s

Maybe a bit safer version that doesn't log huge strings and run wild

full FOO /^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)/s
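
As far as I can tell, the difference between the two versions can be seen in a short Python sketch: wrapping the pattern in ^(?=...) makes the match itself zero-width, so the matched text that might end up in a log stays empty, and the {m,n} bounds cap how far each part can stretch. Assumptions: Python's re module stands in for the Perl regex engine, and the sample message is invented.

```python
import re

# First version: the match spans everything from Content-Type to the address.
UNBOUNDED = re.compile(
    r"\nContent-Type: message\/rfc822.*?\nReceived:"
    r"(?:[^\n]+\n\s+){0,3}[^\n]*\b1.2.3.4\b",
    re.DOTALL,
)
# Second version: a bounded pattern inside a lookahead anchored at the start.
BOUNDED = re.compile(
    r"^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:"
    r"(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)",
    re.DOTALL,
)

# Made-up message with one message/rfc822 attachment.
msg = (
    "Subject: report\n"
    "Content-Type: multipart/mixed; boundary=\"b1\"\n"
    "\n"
    "--b1\n"
    "Content-Type: message/rfc822\n"
    "Content-Disposition: attachment\n"
    "\n"
    "Received: from host.example.net ([1.2.3.4]) by mx.example.org\n"
    "Subject: hello\n"
    "\n"
    "body\n"
    "--b1--\n"
)

m1 = UNBOUNDED.search(msg)
m2 = BOUNDED.search(msg)
print(len(m1.group(0)))   # length of the text consumed by the first version
print(repr(m2.group(0)))  # the lookahead version consumes nothing
```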