Posted to users@spamassassin.apache.org by Robert Boyl <ro...@gmail.com> on 2016/08/03 12:13:31 UTC

scan an HTML file, possible?

Hi, everyone

I have a very nice regex a friend passed me that catches those emails that
have an HTML file attached containing an HTML redirect to some malicious
website.

He has some tool in Exim that scans text in attachments. But I wanted to
use a SpamAssassin rule.

Is there a plugin or some other way in SpamAssassin to scan the text of an
HTML attachment?

Thanks!
Rob

Re: scan an HTML file, possible?

Posted by Dave Funk <db...@engineering.uiowa.edu>.
On Wed, 3 Aug 2016, Robert Boyl wrote:

> Hi, everyone
> 
> I have a very nice regex a friend passed me that catches those emails that have an HTML attached with a redirect html command to
> some malefic website.
> 
> He has some tool in Exim that scans text in attachments. But I wanted to use a spamassassin rule.
> 
> Is there some plugin/way in Spamassassin to scan text of an html attachment?
>

You can write 'full' rules that will work with raw HTML in recognized HTML
attachments. The problem is that SA has logic that ignores non-textual
attachments, and that logic can be fooled by the declared MIME type.

So if the attachment has a MIME type of "text/html", SA will scan it.
If it has a MIME type of "application/octet-stream", SA will ignore it.
But if that same attachment has a filename ending in ".htm", most client
programs will still treat it as HTML and open it as such.

I once wrote a rule to detect such obfuscation, but it had too many false
positives (FPs).
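
The mismatch described above (a part declared as "application/octet-stream"
but named like an HTML file) can be sketched outside SA in Python with the
standard email module. This is only an illustration under stated
assumptions, not the rule mentioned above: the function name
disguised_html_parts and the sample message are made up for the example.

```python
import re
from email import message_from_string

# Filenames that most mail clients would open as HTML.
HTML_NAME = re.compile(r"\.html?$", re.IGNORECASE)


def disguised_html_parts(msg):
    """Return (filename, declared_type) for every named part whose
    filename looks like HTML but whose MIME type is not text/html."""
    hits = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no attachment content themselves
        filename = part.get_filename()
        if not filename:
            continue  # unnamed parts cannot show a name/type mismatch
        declared = part.get_content_type()
        if HTML_NAME.search(filename.strip()) and declared != "text/html":
            hits.append((filename, declared))
    return hits


# Hypothetical sample message: HTML content hidden behind a generic type.
SAMPLE = """\
From: a@example.com
To: b@example.com
Subject: invoice
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XX"

--XX
Content-Type: text/plain

See attachment.
--XX
Content-Type: application/octet-stream; name="invoice.htm"
Content-Disposition: attachment; filename="invoice.htm"

<html><meta http-equiv="refresh" content="0; url=http://example.invalid/"></html>
--XX--
"""

if __name__ == "__main__":
    print(disguised_html_parts(message_from_string(SAMPLE)))
    # prints [('invoice.htm', 'application/octet-stream')]
```

A check this broad flags every mistyped HTML attachment, including ones
from badly configured but legitimate senders, which fits the false
positives reported above.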

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{

Re: scan an HTML file, possible?

Posted by Benny Pedersen <me...@junc.eu>.
On 2016-08-03 14:13, Robert Boyl wrote:

> I have a very nice regex a friend passed me that catches those emails
> that have an HTML attached with a redirect html command to some
> malefic website.

bravo

> He has some tool in Exim that scans text in attachments. But I wanted
> to use a spamassassin rule.

ClamAV?

google the foxhole 3rd-party signatures, tell clamav-milter to accept
infected mail, and then at the MTA stage REJECT official virus hits; what
is left are hits from unofficial sigs, to be handled as spam later

> Is there some plugin/way in Spamassassin to scan text of an html
> attachment?

yes, the clamav plugin; it's just RAM hungry, so take the clamav-milter way

then in spamassassin create rules based on the headers clamav-milter adds

just remember to reject viruses at the MTA stage
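
The split described above (reject official virus hits, score unofficial
signature hits as spam) can be sketched in Python as a decision on the
header clamav-milter adds. This is a rough sketch, not a real MTA or
spamassassin config. It assumes the milter is set up to add an
"X-Virus-Status: Infected (signature)" header and that third-party
signature names end in ".UNOFFICIAL"; check both against your own setup.
The function name classify and the signature names are invented for the
example.

```python
import re
from email import message_from_string

# Assumed header format: "Infected (Signature.Name)"; anything else is clean.
INFECTED = re.compile(r"^Infected \((?P<sig>[^)]+)\)")


def classify(msg):
    """Return 'reject', 'score-as-spam', or 'pass' for a parsed message."""
    status = msg.get("X-Virus-Status", "").strip()
    match = INFECTED.match(status)
    if not match:
        return "pass"  # no hit reported by the scanner
    if match.group("sig").endswith(".UNOFFICIAL"):
        return "score-as-spam"  # third-party sig: leave it to spam rules
    return "reject"  # official signature: refuse at the MTA stage


if __name__ == "__main__":
    raw = (
        "X-Virus-Status: Infected (Example.Redirect.Sig.UNOFFICIAL)\n"
        "Subject: test\n"
        "\n"
        "body\n"
    )
    print(classify(message_from_string(raw)))  # prints score-as-spam
```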