Posted to dev@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2006/09/05 11:02:14 UTC

Re: We're working on OCR?

Theo Van Dinter writes:
> From /.:
> 
> JohnGrahamCumming writes "Everyone's noticed the recent flood of image
> spam (including the SpamAssassin developers who are working on an
> OCR-extension to beat it) ...

I saw that! 

I think jgc got the wrong end of the stick, and misunderstood a comment
I left on his blog at:
http://www.jgc.org/blog/2006/09/subliminal-advertising-in-spam.html .

I said: 

  'FWIW, this evolution has been happening in lockstep with a SpamAssassin
  OCR plugin's development; I think it's one spammer, responding to the new
  filter.

  It's a bit of a silly response, really -- now the presence of the
  animation bits in a GIF file header means 100% spam. ;)'


in other words, referring to the plugin that "decoder" has been developing
over on the SpamAssassin users list.  I'll bet jgc misparsed that as
meaning that we (the SpamAssassin core team) were developing it.

Sorry about that, decoder! ;)

> I also noticed that there's a new OCR package that's been released as open
> source:
> 
> http://google-code-updates.blogspot.com/2006/08/announcing-tesseract-ocr.html
> 
> If we do get into it, we should test different OCR systems.

Personally, I don't think we need to get into it -- in my opinion, it's
going pretty well as a third-party plugin...

--j.

Re: We're working on OCR?

Posted by John Graham-Cumming <jg...@jgc.org>.
Justin Mason wrote:
> in other words, referring to the plugin that "decoder" has been developing
> over on the SpamAssassin users list.  I'll bet jgc misparsed that as
> meaning that we (the SpamAssassin core team) were developing it.

Yep.  I misunderstood.  Sorry.

John.