Posted to users@spamassassin.apache.org by Jack Gostl <go...@argoscomp.com> on 2006/03/07 12:55:13 UTC
All image spam
I've seen some references to this in threads, but I didn't see an answer.
Starting in late November, we began getting hit with spam that is almost entirely a JPEG image. Most of it seems to be "stock recommendations". The message text is minimal, usually HTML, and the real spam content is in the image. Despite all the training that I do, these messages slip past Bayes with a score of no more than 50%, and the rest of the tests don't drive the total score up high enough to help.
I am currently running SpamAssassin 3.0.3. I tried running these messages through SpamAssassin 3.1 and it doesn't seem to help.
Any suggestions?
Thanks - Jack
Re: All image spam
Posted by Craig Baird <cr...@xpressweb.com>.
I'm having similar results here. As others have mentioned, the SARE stock
rules do help somewhat, but they're by no means the proverbial "silver bullet".
As someone else also mentioned, it helps to increase the scores of the
HTML_IMAGE_ONLY_XX rules. I increased 12, 16, 20, and 24 by one point each.
However, that still doesn't nail all of them. I have seen some come through
without even hitting any HTML_IMAGE_ONLY_XX rules.
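A one-point bump like this is normally written as "score" lines in local.cf. The sketch below only generates such lines. The rule names are the ones mentioned above, but the starting scores are placeholders rather than SpamAssassin's shipped values, so check your own rule files before copying anything.

```python
# Placeholder starting scores (assumed, NOT SpamAssassin's real defaults).
current_scores = {
    "HTML_IMAGE_ONLY_12": 2.0,
    "HTML_IMAGE_ONLY_16": 1.5,
    "HTML_IMAGE_ONLY_20": 1.0,
    "HTML_IMAGE_ONLY_24": 0.5,
}


def bumped_score_lines(scores, bump=1.0):
    """Return local.cf 'score' lines with every rule raised by `bump` points."""
    return [f"score {name} {value + bump:.1f}" for name, value in scores.items()]


if __name__ == "__main__":
    # Print lines that could be pasted into local.cf after review.
    print("\n".join(bumped_score_lines(current_scores)))
```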
It seems to me that with these image-only spams, spammers may have finally
stumbled onto a pretty good weapon to counter SA, and to defeat Bayes. With
broadband connections being dirt cheap these days, and with all the zombie
nets at their disposal, spammers can now blast out large spams in a short
amount of time, without causing much drain on their own network resources.
I'm getting image-only spam with attachments ranging in size from about 12K to
70K.
I'll bet it's only a matter of time before we start seeing spam larger than
256K, which I believe is the threshold that most people use to determine
whether to send a message to SA for scanning or not. We'll probably all be
bumping up that threshold at some point. :(
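To make the size concern concrete, here is a minimal sketch of the kind of size gate many setups apply before handing mail to SA. The 256K limit is the figure guessed at above and the function name is made up for illustration; real sites configure this in their MTA glue or scanner wrapper.

```python
SCAN_LIMIT_BYTES = 256 * 1024  # the 256K figure from the post; sites vary


def should_scan(raw_message: bytes, limit: int = SCAN_LIMIT_BYTES) -> bool:
    """Return True if the message is small enough to be sent for scanning."""
    return len(raw_message) <= limit


if __name__ == "__main__":
    # Today's 12K to 70K image spam is scanned; anything over the limit is not.
    print(should_scan(b"x" * (70 * 1024)))
    print(should_scan(b"x" * (300 * 1024)))
```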
Craig
Re: Image MD5sums available, was Re: All image spam
Posted by Dirk Bonengel <di...@bonengel.de>.
Hi, all,
I wonder if the iXhash Plugin I did last summer would catch these.
FYI, the plugin uses some form(s) of fuzzy MD5 checksums of the complete
mail body (not separate MIME parts) and compares the results with
those I provide via DNS.
It's available at http://wiki.apache.org/spamassassin/iXhash.
If not, enhancing it to also compute checksums of attachments would be
nice to have. If only I had the time...
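For readers unfamiliar with the idea, a "fuzzy" checksum generally means normalising the text before hashing it, so that small random variations produce the same digest. The sketch below shows only that general idea. It is not the iXhash algorithm, and the normalisation steps are assumptions chosen for illustration.

```python
import hashlib
import re


def fuzzy_body_digest(body: str) -> str:
    """MD5 of a normalised body (illustrative only, not iXhash's method)."""
    text = body.lower()
    text = re.sub(r"\d+", "0", text)          # random numbers all look alike
    text = re.sub(r"\s+", " ", text).strip()  # layout differences are ignored
    return hashlib.md5(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    a = fuzzy_body_digest("Buy  NOW\n ticker 1234")
    b = fuzzy_body_digest("buy now ticker 99")
    print(a == b)  # the two variants collapse to the same digest
```

A body consisting of a little gibberish text plus an image gives this kind of hash very little to work with, which is why per-attachment checksums come up next.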
Dirk
Image MD5sums available, was Re: All image spam
Posted by William Stearns <ws...@pobox.com>.
Good evening, Jack, all,
We talked about identifying images last summer. There are a few
answers, some of which have been discussed in this thread already.
Razor, pyzor, and DCC are designed to score up messages with
already-seen mime parts (read: if 3 other people think that image is spam,
your spam filter can score it up). As with identifying text parts where
the spammer inserts random words to throw those services off, images can
be subtly modified so the visible area is essentially identical but the
actual image file is different with every spam run.
I offered to put together a catalog of checksums of images used in
spam, and have done so. The md5 and sha1 sums of 44,522 spam images can
be found at http://www.stearns.org/spamattach/ , broken out by category
and in combined files. If anyone wants to take on an interesting project
of computing the md5 checksums of attachments, I'd be willing to set those
lists up as a DNS-queryable RBL (along the lines of
01f5ff6ab05499c94a967409204e6a29.md5.some_rbl.net which would return
127.0.0.2 if known, nothing if not).
I already understand the downsides to this approach (duplicates
work of razor, pyzor, and dcc, images can be altered), but figure the
checksum work has already been done and will continue to be done anyways.
Anyone up for it?
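A rough sketch of what the client side of such a lookup might look like, using only the Python standard library. The zone md5.some_rbl.net is the placeholder from the text above and does not exist as a service; the function names are made up for illustration. The sample at the end also shows the known weakness: changing a single byte of the image yields a completely different checksum.

```python
import hashlib
import socket
from email import message_from_bytes
from email.message import EmailMessage

RBL_ZONE = "md5.some_rbl.net"  # placeholder zone from the post, not a real service


def image_md5s(raw_message: bytes):
    """Return the MD5 hex digest of every decoded image part in a message."""
    msg = message_from_bytes(raw_message)
    digests = []
    for part in msg.walk():
        if part.get_content_maintype() == "image":
            payload = part.get_payload(decode=True)  # undo base64 etc.
            if payload:
                digests.append(hashlib.md5(payload).hexdigest())
    return digests


def rbl_query_name(digest: str, zone: str = RBL_ZONE) -> str:
    """Build the DNS name to look up, e.g. <md5>.md5.some_rbl.net."""
    return f"{digest}.{zone}"


def is_listed(digest: str, zone: str = RBL_ZONE) -> bool:
    """True if the name resolves to 127.0.0.2; any lookup failure means 'unknown'."""
    try:
        return socket.gethostbyname(rbl_query_name(digest, zone)) == "127.0.0.2"
    except OSError:
        return False


def build_sample(image_bytes: bytes) -> bytes:
    """Build a small test message with one image attachment."""
    msg = EmailMessage()
    msg["Subject"] = "test"
    msg.set_content("gibberish")
    msg.add_attachment(image_bytes, maintype="image", subtype="jpeg",
                       filename="a.jpg")
    return msg.as_bytes()


if __name__ == "__main__":
    image = b"\xff\xd8\xff\xe0 fake jpeg data"
    for digest in image_md5s(build_sample(image)):
        print(rbl_query_name(digest))
    # One extra byte in the image gives an unrelated digest.
    print(image_md5s(build_sample(image + b"\x00")))
```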
Cheers,
- Bill
---------------------------------------------------------------------------
"That man is a success who lived well, laughed often and loved
much: who has gained the respect of intelligent men and the love of
children: who has filled his niche and accomplished his task: who leaves
the world a better place than he found it, whether by an improved poppy,
a perfect poem or a rescued soul; who never lacked appreciation of
earth's beauty or failed to express it; who looked for the best in
others and gave the best he had."
-- Robert Louis Stevenson.
--------------------------------------------------------------------------
William Stearns (wstearns@pobox.com). Mason, Buildkernel, freedups, p0f,
rsync-backup, ssh-keyinstall, dns-check, more at: http://www.stearns.org
--------------------------------------------------------------------------
Re: All image spam
Posted by Loren Wilton <lw...@earthlink.net>.
> Any suggestions?
The SARE stock rules. They won't catch all of 'em, but they will catch a lot.
Loren
Re: All image spam
Posted by le...@srs.gov.
We jacked up the scoring on HTML_IMAGE_ONLY_12 to a 5, and are catching
about 90% of these now with almost no false positives.
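Why a score of 5 on HTML_IMAGE_ONLY_12 is so effective: SpamAssassin's default required_score is 5.0, so at that setting the rule can tag a message all by itself. The sketch below just does that arithmetic. The "before" scores are placeholders, not actual rule defaults.

```python
REQUIRED_SCORE = 5.0  # SpamAssassin's default required_score


def is_spam(hit_scores, required=REQUIRED_SCORE):
    """A message is tagged when the sum of its rule hits reaches the threshold."""
    return sum(hit_scores.values()) >= required


if __name__ == "__main__":
    # Placeholder scores for a typical image-only message before the change.
    before = {"HTML_IMAGE_ONLY_12": 2.0, "BAYES_50": 0.0}
    # Same hits after raising HTML_IMAGE_ONLY_12 to 5.
    after = {"HTML_IMAGE_ONLY_12": 5.0, "BAYES_50": 0.0}
    print(is_spam(before), is_spam(after))
```

The trade-off is that any legitimate mail hitting that one rule is also tagged, so the false-positive rate is worth watching after a change like this.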
RE: All image spam
Posted by Craig Baird <cr...@xpressweb.com>.
Quoting Martin Hepworth <ma...@solid-state-logic.com>:
> Jack
>
> If you turn on the URI-RBLs in 3.1 (see v310.pre) you should see a
> reduction
> in this type of spam.
I don't think I've ever seen a URI in one of these... They purposely leave
out anything in the actual message body that could be used to block their
mail. All that is present in the message body is gibberish that typically
doesn't even trigger a significant Bayes score. All the spam content,
including any URIs, is contained in the image.
Craig
RE: All image spam
Posted by Martin Hepworth <ma...@solid-state-logic.com>.
Jack
If you turn on the URI-RBLs in 3.1 (see v310.pre) you should see a reduction
in this type of spam.
--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300