Posted to users@spamassassin.apache.org by "Sharma, Ashish" <as...@hp.com> on 2011/07/20 16:43:00 UTC

Suggest OCR plugin on Spamassassin 3.3.1 for image spam

Hi,

I am currently using FuzzyOCR (3.6.0) for image spam control on my SpamAssassin (3.3.1) stack.

The FuzzyOCR download page (http://fuzzyocr.own-hero.net/wiki/Downloads) says that this FuzzyOCR release is available only for testing on SpamAssassin 3.2.x.

Somehow I have this version of FuzzyOCR running on my SpamAssassin stack.

Lately I have not been satisfied with FuzzyOCR's performance, and I keep getting errors from it.

Moreover, community support and active development for FuzzyOCR seem to be missing.

Can someone suggest some better OCR plugin for Spamassassin 3.3.1 for image spam?

Thanks
Ashish Sharma

Re: Suggest OCR plugin on Spamassassin 3.3.1 for image spam

Posted by Kris Deugau <kd...@vianet.ca>.
darxus@chaosreigns.com wrote:
> On 07/20, Sharma, Ashish wrote:
>> Can someone suggest some better OCR plugin for Spamassassin 3.3.1 for image spam?
>
> It still seems strange to me that anybody has ever bothered with using OCR
> to deal with image spam, when it's so easy, and for me not problematic, to
> just block all emails that might be image spam - those with an attached
> image that is embedded in the body of an html mail.

I have to ask - have you ever tried this in the context of an ISP mail 
system?

A great many users consider sending pictures and videos by email to be 
the ultimate purpose of email...  and many of the same set of users take 
great delight in (ab)using Outlook's "stationery" or using Incredimail, 
as well as overdosing on funny fonts and colours in the text.

-kgd

Re: Suggest OCR plugin on Spamassassin 3.3.1 for image spam

Posted by "David F. Skoll" <df...@roaringpenguin.com>.
On Thu, 21 Jul 2011 07:47:00 +0100
"Sharma, Ashish" <as...@hp.com> wrote:

> Can you please outline the other techniques that you use to catch
> image spams?

We find Bayes (we have our own implementation) and RBLs (again, we have
our own) work pretty well.

Regards,

David.

RE: Suggest OCR plugin on Spamassassin 3.3.1 for image spam

Posted by "Sharma, Ashish" <as...@hp.com>.
David, 

>[We don't use OCR, as it happens.  We usually catch image spams anyway
>using other techniques.]

Can you please outline the other techniques that you use to catch image spams?

Thanks
Ashish Sharma


Re: Suggest OCR plugin on Spamassassin 3.3.1 for image spam

Posted by "David F. Skoll" <df...@roaringpenguin.com>.
On Wed, 20 Jul 2011 21:18:48 -0400
darxus@chaosreigns.com wrote:

> It still seems strange to me that anybody has ever bothered with
> using OCR to deal with image spam, when it's so easy, and for me not
> problematic, to just block all emails that might be image spam -
> those with an attached image that is embedded in the body of an html
> mail.

We receive many legitimate [sic] emails that use an embedded image
in that way.  Lots of companies think it's really cool to include their
logo in a .sig :(

> I've been very happily using this since 2006, and it completely made
> image spam go away.

Is this on a business account where it's critical for you to accept
email from... ahem... somewhat less-than-knowledgeable people?

> Inlined attached images are not a feature that I find anywhere near
> worth having enough to justify needing to OCR image spam.

Unfortunately, we can't block those.  The FP rate for us would be
horrendous.

[We don't use OCR, as it happens.  We usually catch image spams anyway
using other techniques.]

Regards,

David.


Re: Suggest OCR plugin on Spamassassin 3.3.1 for image spam

Posted by Axb <ax...@gmail.com>.
http://wiki.apache.org/spamassassin/UnmaintainedCustomPlugins

"OCR scanner and image validator SA-plugin"

"OCR Plugin"

may be worth a try.. no idea how well they work

<sarcasm>
The Spamassassin wiki is so cool
</sarcasm>



RE: Suggest OCR plugin on Spamassassin 3.3.1 for image spam

Posted by "Sharma, Ashish" <as...@hp.com>.
All,

The current functionality requires me to receive mail that contains images and to process it.

So I want a good tool to deal with image spam.

Please suggest some.

Thanks
Ashish Sharma

Re: Suggest OCR plugin on Spamassassin 3.3.1 for image spam

Posted by Jason Bertoch <ja...@i6ix.com>.
On 7/20/2011 9:18 PM, darxus@chaosreigns.com wrote:
> On 07/20, Sharma, Ashish wrote:
>> Can someone suggest some better OCR plugin for Spamassassin 3.3.1 for image spam?
> It still seems strange to me that anybody has ever bothered with using OCR
> to deal with image spam, when it's so easy, and for me not problematic, to
> just block all emails that might be image spam - those with an attached
> image that is embedded in the body of an html mail.
>
> Inlined attached images are not a feature that I find anywhere near worth
> having enough to justify needing to OCR image spam.
>

Image spam was a huge deal when it first came out, and there were 
several sources scrambling to offer a solution, including resources to 
involve Bayes on the decoded text.  Those worked well enough to deter, 
for the time-being anyway, that method of spamming.

That said, while I agree with your sentiment toward inline images and 
HTML mail in general, they are a common business practice and many folks 
simply can't use the outright block method.

At my last job, I eventually found that image-spam dropped to such a 
significant low that I didn't need OCR anymore but was still required to 
allow inline images through.

/Jason

Re: Suggest OCR plugin on Spamassassin 3.3.1 for image spam

Posted by da...@chaosreigns.com.
On 07/20, Sharma, Ashish wrote:
> Can someone suggest some better OCR plugin for Spamassassin 3.3.1 for image spam?

It still seems strange to me that anybody has ever bothered with using OCR
to deal with image spam, when it's so easy, and for me not problematic, to
just block all emails that might be image spam - those with an attached
image that is embedded in the body of an html mail.

In my postfix main.cf I have:
body_checks = pcre:/etc/postfix/body_checks
And that file just contains:
/\bsrc\s*=(?:3D)?\s*["']?cid:/ REJECT Your email was rejected because you embedded an attached image in the body.

So if somebody ever sends me a legit email with an inlined attached image,
they'll still get an error, without me causing any backscatter.

My mom was annoyed that she couldn't use some tool to decorate her emails
to me with garbage, but... that doesn't qualify as a negative for me.

I've been very happily using this since 2006, and it completely made image
spam go away.

People can still send me images attached to emails, and they can still send
me emails with images embedded in the body of html emails as long as they're
hosted on a web server and not attached.  It only gets rejected if the
image is attached *and* embedded in the body of the email.
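
[Editorial sketch, not part of the original post.] The behaviour described
above can be checked offline by running the same pattern over a few sample
body lines. The Python below is only an approximation under stated
assumptions: the sample lines and the would_reject() helper are made up for
illustration, Python's re module stands in for Postfix's PCRE engine, and
re.IGNORECASE is used because Postfix pcre tables match case-insensitively
by default. body_checks inspects the message body one line at a time, so
each sample is a single line. "cid:" is the URL scheme that points at an
attached MIME part by its Content-ID, which is why only attached-and-inlined
images match.

```python
import re

# Pattern copied from the body_checks file quoted above.
# IGNORECASE mimics the default case-insensitive matching of Postfix pcre tables.
CID_IMAGE = re.compile(r"""\bsrc\s*=(?:3D)?\s*["']?cid:""", re.IGNORECASE)


def would_reject(body_line: str) -> bool:
    """Return True if this single body line would trigger the REJECT action."""
    return CID_IMAGE.search(body_line) is not None


# Hypothetical sample lines, invented for this sketch.
samples = {
    # Image attached to the message and referenced inline by Content-ID.
    "inline_cid": '<img src="cid:image001.png@01CC47A1">',
    # Same reference after quoted-printable encoding ("=" becomes "=3D").
    "inline_cid_qp": '<img src=3D"cid:image001.png@01CC47A1">',
    # Image hosted on a web server: embedded in the HTML but not attached.
    "remote": '<img src="http://www.example.com/logo.png">',
    # Plain attachment that the HTML body never references.
    "attachment_header": 'Content-Disposition: attachment; filename="photo.jpg"',
}

results = {name: would_reject(line) for name, line in samples.items()}

for name, rejected in results.items():
    print(f"{name}: {'REJECT' if rejected else 'pass'}")
```

On a live system the table itself can be queried with postmap, for example
postmap -q '<img src="cid:x">' pcre:/etc/postfix/body_checks, which should
print the REJECT action for a matching line and nothing otherwise.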

Inlined attached images are not a feature that I find anywhere near worth
having enough to justify needing to OCR image spam.

-- 
"I finally figured out the only reason to be alive is to enjoy it."
- Rita Mae Brown
http://www.ChaosReigns.com