Posted to users@spamassassin.apache.org by Pedro David Marco <pe...@yahoo.com> on 2017/08/11 18:28:56 UTC

Attachments with no Content-Type mime header

Hi everybody...
When an email has a MIME part with no Content-Type header, is there any way to force SA to "guess" the format based on other criteria, such as the file extension?
Example:
Content-Disposition: attachment;
 filename="details.pdf"
Content-Transfer-Encoding: base64

Thanks!
----PedroD

Re: Attachments with no Content-Type mime header

Posted by Paul Stead <pa...@zeninternet.co.uk>.
Attached is a very basic example of detecting whether an email has a PDF attachment, using the magic number only.

Hope this works/helps

Paul

From: Pedro David Marco <pe...@yahoo.com>
Reply-To: Pedro David Marco <pe...@yahoo.com>
Date: Thursday, 17 August 2017 at 09:41
To: Paul Stead <pa...@zeninternet.co.uk>, "users@spamassassin.apache.org" <us...@spamassassin.apache.org>
Subject: Re: Attachments with no Content-Type mime header

Thanks Paul...

it is weird...

the documentation says:

find_parts()
Used to search the tree for specific MIME parts. An array of matching Node objects (pointers into the tree) is returned. The parameters that can be passed in are (in order, all scalars):
Regexp - Used to match against each part's Content-Type header, specifically the type and not the rest of the header.

So the regex applies only to the Content-Type, and when I tested it, it worked as the doc says...

Paul, would it be possible for you to send me the sample you tried it with?

Thanks!

----
PedroD

________________________________
From: Paul Stead <pa...@zeninternet.co.uk>
To: Pedro David Marco <pe...@yahoo.com>; "users@spamassassin.apache.org" <us...@spamassassin.apache.org>
Sent: Thursday, August 17, 2017 1:17 AM
Subject: Re: Attachments with no Content-Type mime header

I’ve checked and as in the plugin,

  foreach my $part ($pms->{msg}->find_parts(qr/./, 1)) {

does find each attachment, including those without a Content-Type header, so the method below can be used on those parts regardless of the missing Content-Type

Paul

From: Pedro David Marco <pe...@yahoo.com>
Reply-To: Pedro David Marco <pe...@yahoo.com>
Date: Wednesday, 16 August 2017 at 23:49
To: Paul Stead <pa...@zeninternet.co.uk>, "users@spamassassin.apache.org" <us...@spamassassin.apache.org>
Subject: Re: Attachments with no Content-Type mime header

Thanks Paul,

but your plugin uses find_parts(), which makes it pointless if there is no Content-Type MIME header...


--------
PedroD


>The magic number or file signature can be helpful in determining the filetype:

>https://en.wikipedia.org/wiki/List_of_file_signatures

>I make use of this in the OLEMacro plugin: https://github.com/fmbla/spamassassin-olemacro/

>Paul Stead

--
Paul Stead
Systems Engineer
Zen Internet


Re: Attachments with no Content-Type mime header

Posted by Paul Stead <pa...@zeninternet.co.uk>.
This. With no Content-Type header, the type is set to “text/plain” by default. I should probably have said so earlier.

On 17/08/2017, 15:53, "RW" <rw...@googlemail.com> wrote:

    Have you ruled out the possibility that the MIME type for such parts is
    set to the default MIME type of text/plain?



--
Paul Stead
Systems Engineer
Zen Internet

Re: Attachments with no Content-Type mime header

Posted by RW <rw...@googlemail.com>.
On Thu, 17 Aug 2017 08:41:57 +0000 (UTC)
Pedro David Marco wrote:

> Thanks Paul...
> it is weird...
> the documentation says: 
>    
>    - find_parts()   
> 
>       - Used to search the tree for specific MIME parts. An array of
> matching Node objects (pointers into the tree) is returned. The
> parameters that can be passed in are (in order, all scalars):
>       - Regexp - Used to match against each part's Content-Type
> header, specifically the type and not the rest of the header.

Have you ruled out the possibility that the MIME type for such parts is
set to the default MIME type of text/plain?

Re: Attachments with no Content-Type mime header

Posted by Pedro David Marco <pe...@yahoo.com>.
Thanks Paul...
it is weird...
the documentation says: 
   
   - find_parts()   

      - Used to search the tree for specific MIME parts. An array of matching Node objects (pointers into the tree) is returned. The parameters that can be passed in are (in order, all scalars):
      - Regexp - Used to match against each part's Content-Type header, specifically the type and not the rest of the header.

So the regex applies only to the Content-Type, and when I tested it, it worked as the doc says...
Paul, would it be possible for you to send me the sample you tried it with?
Thanks!
----PedroD

From: Paul Stead <pa...@zeninternet.co.uk>
To: Pedro David Marco <pe...@yahoo.com>; "users@spamassassin.apache.org" <us...@spamassassin.apache.org>
Sent: Thursday, August 17, 2017 1:17 AM
Subject: Re: Attachments with no Content-Type mime header

I’ve checked and as in the plugin,

  foreach my $part ($pms->{msg}->find_parts(qr/./, 1)) {

does find each attachment, including those without a Content-Type header, so the method below can be used on those parts regardless of the missing Content-Type

Paul

From: Pedro David Marco <pe...@yahoo.com>
Reply-To: Pedro David Marco <pe...@yahoo.com>
Date: Wednesday, 16 August 2017 at 23:49
To: Paul Stead <pa...@zeninternet.co.uk>, "users@spamassassin.apache.org" <us...@spamassassin.apache.org>
Subject: Re: Attachments with no Content-Type mime header

Thanks Paul,

but your plugin uses find_parts(), which makes it pointless if there is no Content-Type MIME header...

--------
PedroD

>The magic number or file signature can be helpful in determining the filetype:

>https://en.wikipedia.org/wiki/List_of_file_signatures

>I make use of this in the OLEMacro plugin: https://github.com/fmbla/spamassassin-olemacro/

>Paul Stead

--
Paul Stead
Systems Engineer
Zen Internet


   

Re: Attachments with no Content-Type mime header

Posted by Paul Stead <pa...@zeninternet.co.uk>.
I’ve checked and as in the plugin,

  foreach my $part ($pms->{msg}->find_parts(qr/./, 1)) {

does find each attachment, including those without a Content-Type header, so the method below can be used on those parts regardless of the missing Content-Type
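
A minimal sketch of how that loop might be combined with a signature check follows. It is not the attached plugin. The function name pdf_like_parts is hypothetical, and it assumes that $msg behaves like $pms->{msg}, that find_parts(qr/./, 1) returns leaf parts as described above, and that each part offers a decode() method taking an optional byte limit, as the Mail::SpamAssassin::Message::Node documentation describes.

```perl
use strict;
use warnings;

# Hypothetical helper: return the leaf parts whose decoded content starts
# with the PDF signature, whatever type SpamAssassin assigned to them.
# $msg is assumed to behave like $pms->{msg}.
sub pdf_like_parts {
    my ($msg) = @_;
    my @hits;

    # qr/./ matches any type string; the second argument asks for leaf parts only.
    foreach my $part ($msg->find_parts(qr/./, 1)) {
        # decode() is assumed to undo base64/quoted-printable; the argument
        # asks for only the first few decoded bytes.
        my $head = $part->decode(8);
        next unless defined $head && index($head, '%PDF-') == 0;
        push @hits, $part;
    }
    return @hits;
}
```

Inside a plugin's eval method this could be called as pdf_like_parts($pms->{msg}); the parts it returns can then be inspected further without relying on their declared type.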

Paul

From: Pedro David Marco <pe...@yahoo.com>
Reply-To: Pedro David Marco <pe...@yahoo.com>
Date: Wednesday, 16 August 2017 at 23:49
To: Paul Stead <pa...@zeninternet.co.uk>, "users@spamassassin.apache.org" <us...@spamassassin.apache.org>
Subject: Re: Attachments with no Content-Type mime header

Thanks Paul,

but your plugin uses find_parts(), which makes it pointless if there is no Content-Type MIME header...


--------
PedroD


>The magic number or file signature can be helpful in determining the filetype:

>https://en.wikipedia.org/wiki/List_of_file_signatures

>I make use of this in the OLEMacro plugin: https://github.com/fmbla/spamassassin-olemacro/

>Paul Stead

--
Paul Stead
Systems Engineer
Zen Internet

Re: Attachments with no Content-Type mime header

Posted by Pedro David Marco <pe...@yahoo.com>.
Thanks Paul,
but your plugin uses find_parts(), which makes it pointless if there is no Content-Type MIME header...

--------PedroD
>The magic number or file signature can be helpful in determining the filetype:
>https://en.wikipedia.org/wiki/List_of_file_signatures
>I make use of this in the OLEMacro plugin: https://github.com/fmbla/spamassassin-olemacro/
>Paul Stead


Re: Attachments with no Content-Type mime header

Posted by Paul Stead <pa...@zeninternet.co.uk>.

From: Pedro David Marco <pe...@yahoo.com>
Reply-To: Pedro David Marco <pe...@yahoo.com>
Date: Wednesday, 16 August 2017 at 22:32
To: David Niklas <do...@mail.com>, "users@spamassassin.apache.org" <us...@spamassassin.apache.org>
Subject: Re: Attachments with no Content-Type mime header

Hi David...

I agree with you... but some functions like find_parts() do not work if there are no Content-Type headers, which makes it impossible to analyze some attachments...

I am writing a plugin to detect suspicious PDFs...

Maybe there's a better way to analyze attachments than using find_parts()...

Thanks!

------
PedroD


The magic number or file signature can be helpful in determining the filetype:

https://en.wikipedia.org/wiki/List_of_file_signatures

I make use of this in the OLEMacro plugin: https://github.com/fmbla/spamassassin-olemacro/
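
As a rough illustration of the idea (not code taken from the OLEMacro plugin), a signature check on already-decoded attachment bytes could look like the sketch below. The %MAGIC table and the sniff_type name are hypothetical; the three signatures are the well-known leading bytes for PDF, ZIP and OLE2 files.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Hypothetical signature table: label => leading bytes of the file.
my %MAGIC = (
    pdf => "%PDF-",
    zip => "PK\x03\x04",
    ole => "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1",
);

# Return the label whose signature starts the data, or undef if none does.
sub sniff_type {
    my ($data) = @_;
    return unless defined $data;
    for my $label (sort keys %MAGIC) {
        my $sig = $MAGIC{$label};
        return $label if substr($data, 0, length $sig) eq $sig;
    }
    return;
}

# Usage: decode a base64 body first, then look at the leading bytes.
my $body = decode_base64("JVBERi0xLjQK");       # "%PDF-1.4\n"
print sniff_type($body) // 'unknown', "\n";     # pdf
```

Only the start of the data is examined, so this ignores both the filename and the declared Content-Type.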

--
Paul Stead
Systems Engineer
Zen Internet

Re: Attachments with no Content-Type mime header

Posted by Pedro David Marco <pe...@yahoo.com>.
Hi David...
I agree with you... but some functions like find_parts() do not work if there are no Content-Type headers, which makes it impossible to analyze some attachments...
I am writing a plugin to detect suspicious PDFs...
Maybe there's a better way to analyze attachments than using find_parts()...
Thanks!
------PedroD


>You should not trust what the file's extension says the file is. Also
>file(1) does not yet do a good enough job to be reliable this way.

>As for guessing, I think that the best guess that could be applied would
>be a test of the file to see if, once decoded, it is a utf-8 encoded,
>ASCII, or iso8859-X encoded text file. Failing that I would assume it is
>either an MS doc/ppt/spreadsheet/etc, pdf file, or pure binary. Then you
>could try trusting the file extension.
>Otherwise, it is a text file and could contain an innocent html or
>an uncompressed ps file or a dangerous JS infection program.
>Either way I'd be really careful.
>
>What is your use case?
>What do you intend to do with a pdf file vs. an html one?
>
>Sincerely,
>David


   

Re: Attachments with no Content-Type mime header

Posted by David Niklas <do...@mail.com>.
On Fri, 11 Aug 2017 18:28:56 +0000 (UTC)
Pedro David Marco <pe...@yahoo.com> wrote:

> Hi everybody...
> When an email has a MIME part with no Content-Type header, is there any
> way to force SA "guess" the format based on other criteria... file
> extension, for example? Example: Content-Disposition: attachment;
> filename="details.pdf"Content-Transfer-Encoding: base64
> 
> Thanks!
> ----PedroD

You should not trust what the file's extension says the file is. Also
file(1) does not yet do a good enough job to be reliable this way.

As for guessing, I think that the best guess that could be applied would
be a test of the file to see if, once decoded, it is a utf-8 encoded,
ASCII, or iso8859-X encoded text file. Failing that I would assume it is
either an MS doc/ppt/spreadsheet/etc, pdf file, or pure binary. Then you
could try trusting the file extension.
Otherwise, it is a text file and could contain an innocent html or
an uncompressed ps file or a dangerous JS infection program.
Either way I'd be really careful.
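
The first step of that guess, testing whether the decoded bytes look like text, might be sketched as follows. The classify_bytes name and its four labels are hypothetical. Because every byte sequence is valid in an iso8859-X charset, the sketch cannot confirm such an encoding and reports it only as '8bit'.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Rough classification of already-decoded attachment bytes.
# Returns 'binary', 'ascii', 'utf8', or '8bit' (possibly iso8859-X text).
sub classify_bytes {
    my ($bytes) = @_;

    # Control characters other than tab, LF, VT, FF and CR suggest binary data.
    return 'binary' if $bytes =~ /[\x00-\x08\x0E-\x1F\x7F]/;

    # No high bytes at all: plain ASCII.
    return 'ascii' if $bytes !~ /[\x80-\xFF]/;

    # Strict UTF-8 decoding dies on malformed input; work on a copy because
    # decode() may modify its argument when a CHECK value is given.
    my $copy = $bytes;
    my $ok = eval { decode('UTF-8', $copy, FB_CROAK); 1 };
    return $ok ? 'utf8' : '8bit';
}

print classify_bytes("plain text\n"), "\n";   # ascii
```

Anything reported as 'binary' would then go on to a signature check, with the file extension consulted last, as suggested above.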

What is your use case?
What do you intend to do with a pdf file vs. an html one?

Sincerely,
David

Re: Attachments with no Content-Type mime header

Posted by RW <rw...@googlemail.com>.
On Fri, 18 Aug 2017 02:38:19 -0400
Rupert Gallagher wrote:

> What is the problem that you wish to solve?
> 
> The purpose of SA is to flag SPAM. In this case, you already have all
> the information you need, because subtype specification is MANDATORY.
> There are no default subtypes. SA must flag the email, because it is
> not compliant 

A mime section without a Content-Type header is assumed to be
"text/plain; charset=us-ascii", so it has a subtype of "plain".


> to RFC 822.

RFC 822 doesn't say anything about MIME.
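
To make the default concrete, here is a small sketch of how a parser might derive the effective type of a part from its header block, falling back to text/plain when Content-Type is absent. The effective_type function is hypothetical and is not SpamAssassin's own parser.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lowercased type/subtype for a MIME part's
# header block, using the text/plain default when Content-Type is missing.
sub effective_type {
    my ($headers) = @_;

    # Unfold continuation lines so each header sits on one line.
    (my $unfolded = $headers) =~ s/\r?\n[ \t]+/ /g;

    if ($unfolded =~ m{^Content-Type:[ \t]*([^\s;/]+/[^\s;]+)}mi) {
        return lc $1;
    }
    return 'text/plain';
}

# The part from the original question: no Content-Type header at all.
my $part_headers = <<'EOT';
Content-Disposition: attachment;
 filename="details.pdf"
Content-Transfer-Encoding: base64
EOT

print effective_type($part_headers), "\n";   # text/plain
```

This is why a regexp handed to find_parts() can still match such a part: it is compared against the defaulted type rather than against a missing header.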


Re: Attachments with no Content-Type mime header

Posted by Rupert Gallagher <ru...@protonmail.com>.
Double checked. Thank you for the correction.

http://www.w3.org/Protocols/rfc1341/4_Content-Type.html

Sent from ProtonMail Mobile

On Fri, Aug 18, 2017 at 7:24 PM, John Hardin <jh...@impsec.org> wrote:

> On Fri, 18 Aug 2017, Rupert Gallagher wrote:
>
> > The purpose of SA is to flag SPAM.
>
> Correct.
>
> > In this case, you already have all the information you need, because
> > subtype specification is MANDATORY. There are no default subtypes.
>
> ok.
>
> > SA must flag the email, because it is not compliant to RFC 822.
>
> Not true.
>
> As has been said before, SA is not an RFC-compliance auditing tool. For
> SA, RFC compliance checks are only useful insofar as non-compliance is an
> indicator of spam.
>
> The omission of a MIME Content-Type may be a spam indicator, or it may
> simply be an indicator of a sloppy MUA implementation that takes advantage
> of Postel's principle.
>
> --
> John Hardin KA7OHZ http://www.impsec.org/~jhardin/
> jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
> key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
> -----------------------------------------------------------------------
> Watch... Wallet... Gun... Knee... -- Denny Crane
> -----------------------------------------------------------------------
> 6 days until the 1938th anniversary of the destruction of Pompeii

Re: Attachments with no Content-Type mime header

Posted by John Hardin <jh...@impsec.org>.
On Fri, 18 Aug 2017, Rupert Gallagher wrote:

> The purpose of SA is to flag SPAM.

Correct.

> In this case, you already have all the information you need, because 
> subtype specification is MANDATORY. There are no default subtypes.

ok.

> SA must flag the email, because it is not compliant to RFC 822.

Not true.

As has been said before, SA is not an RFC-compliance auditing tool. For 
SA, RFC compliance checks are only useful insofar as non-compliance is an 
indicator of spam.

The omission of a MIME Content-Type may be a spam indicator, or it may 
simply be an indicator of a sloppy MUA implementation that takes advantage 
of Postel's principle.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Watch... Wallet... Gun... Knee...                    -- Denny Crane
-----------------------------------------------------------------------
  6 days until the 1938th anniversary of the destruction of Pompeii

Re: Attachments with no Content-Type mime header

Posted by Rupert Gallagher <ru...@protonmail.com>.
What is the problem that you wish to solve?

The purpose of SA is to flag SPAM. In this case, you already have all the information you need, because subtype specification is MANDATORY. There are no default subtypes. SA must flag the email, because it is not compliant to RFC 822.

Sent from ProtonMail Mobile

On Fri, Aug 11, 2017 at 8:28 PM, Pedro David Marco <pe...@yahoo.com> wrote:

> Hi everybody...
>
> When an email has a MIME part with no Content-Type header, is there any way to force SA "guess" the format based on other criteria... file extension, for example?
>
> Example:
>
> Content-Disposition: attachment;
> filename="details.pdf"
> Content-Transfer-Encoding: base64
>
> Thanks!
>
> ----
> PedroD