Posted to users@spamassassin.apache.org by "Dan Mahoney, System Admin" <da...@prime.gushi.org> on 2009/04/27 22:16:15 UTC

Re: [sa-list] Re: A rant about FUZZY_OCR

On Mon, 27 Apr 2009, Henrik K wrote:
> None of this makes sense. If you don't have a test server, too bad. If
> you don't trust the "score-changing values", too bad. It all worked for me.
>
>> It's a great idea, but I'd like to see it mature some first, especially
>> with respect to its documentation, test emails, word list, and live testing.
>
> It was quickly developed in response to an ongoing problem. The problem
> disappeared years ago. It was mature enough for 99% of users at that time,
> though it did add lots of complexity, and stricter MTA rules etc. handled
> the job just fine as well.

The problem exists now: there is PNG spam, and there will continue to be, 
because it gets through.  Right now the only way I find this blocked is if 
SpamCop blocks it.

Ideally, what I'd like to see with regard to FuzzyOCR is:

1) Just patch it enough to work with 3.2 and 3.3 -- I don't have the 
internals know-how to do this, and I don't know if Decoder still reads 
this list.

2) A debug mode, whereby the plugin would report its own score without 
changing the total, possibly by applying an equal negative value.

3) Wordlists loadable from userprefs, if not bayes.

4) A recommended configuration, along with "shortcircuit" documentation.
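
To make item 2 concrete, here is a rough Python sketch of the idea. It is 
not FuzzyOCR or SpamAssassin code; the function name and the rule names 
OCR_HIT and OCR_DEBUG_OFFSET are made up for illustration. In debug mode 
the hit is still listed, but an equal negative entry cancels it so the 
message total does not move.

```python
def apply_plugin_score(total, plugin_score, debug=False):
    """Return (new_total, hits) after applying a plugin's score.

    Hypothetical helper for illustration only. `hits` is a list of
    (rule_name, score) pairs that could be shown in a report.
    """
    hits = [("OCR_HIT", plugin_score)]
    if debug:
        # Cancel the hit with an equal negative value: the report still
        # shows what the plugin wanted to add, but the total is unchanged.
        hits.append(("OCR_DEBUG_OFFSET", -plugin_score))
    new_total = total + sum(score for _, score in hits)
    return new_total, hits
```

With debug off the plugin's score is added as usual; with debug on the 
two entries sum to zero, so you can watch what the plugin would have done 
on live mail without it changing any verdicts.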

-Dan

-- 

"Ca. Tas. Tro. Phy."

-John Smedley, March 28th 1998, 3AM

--------Dan Mahoney--------
Techie,  Sysadmin,  WebGeek
Gushi on efnet/undernet IRC
ICQ: 13735144   AIM: LarpGM
Site:  http://www.gushi.org
---------------------------


Re: [sa-list] Re: A rant about FUZZY_OCR

Posted by John Hardin <jh...@impsec.org>.
On Mon, 27 Apr 2009, Dan Mahoney, System Admin wrote:

> 3) Wordlists loadable from userprefs, if not bayes.

Along with that, the detected words should be (somehow) fed into bayes for 
analysis along with the other message text.

We touched on that last time fuzzyOCR was active.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Vista is at best mildly annoying and at worst makes you want to
   rush to Redmond, Wash. and rip somebody's liver out.      -- Forbes
-----------------------------------------------------------------------
  96 days since Obama's inauguration and still no unicorn!

Re: [sa-list] Re: A rant about FUZZY_OCR

Posted by "Dan Mahoney, System Admin" <da...@prime.gushi.org>.
On Mon, 27 Apr 2009, Jo Rhett wrote:

> On Apr 27, 2009, at 1:16 PM, Dan Mahoney, System Admin wrote:
>> The problem exists now: there is PNG spam, and there will continue to be, 
>> because it gets through.  Right now the only way I find this blocked is if 
>> SpamCop blocks it.
>
>
> Just as a point of reference, I'd like to note that we haven't bothered with 
> FuzzyOCR here and absolutely none of the spam which reaches my inbox is a PNG 
> or JPG or GIF spam.   SA does block it, and it does so without FuzzyOCR.
>
> That said, we have jacked the scores for e-mail with images and no text and 
> that might be why.   We never, ever receive valid e-mail with no text in it.

The spam I've been getting contains text, lots of it.  Markov-chain-like 
crap that is 100 percent irrelevant to the image.

-Dan


-- 

"She's NOT my girlfriend!"

-Dan Mahoney, Quite a bit recently.

--------Dan Mahoney--------
Techie,  Sysadmin,  WebGeek
Gushi on efnet/undernet IRC
ICQ: 13735144   AIM: LarpGM
Site:  http://www.gushi.org
---------------------------


Re: A rant about FUZZY_OCR

Posted by LuKreme <kr...@kreme.com>.
On 27-Apr-2009, at 16:06, Jo Rhett wrote:
> On Apr 27, 2009, at 1:16 PM, Dan Mahoney, System Admin wrote:
>> The problem exists now: there is PNG spam, and there will continue  
>> to be, because it gets through.  Right now the only way I find this  
>> blocked is if SpamCop blocks it.
>
> Just as a point of reference, I'd like to note that we haven't  
> bothered with FuzzyOCR here and absolutely none of the spam which  
> reaches my inbox is a PNG or JPG or GIF spam.   SA does block it,  
> and it does so without FuzzyOCR.

Yeah, I've not seen an image spam in my mailboxes in a long time.  I  
figured people were getting spam I'm not getting...

> We never, ever receive valid e-mail with no text in it.

Oh, I do all the time, but it's from people whom the AWL scores well  
down, pulling them out of spam range (My brother often sends me silly  
pictures with nothing else in the email).

BTW, is there any way to see what the AWL adjustment is for a  
particular email or for a specific sender couplet?

-- 
Anybody who could duck the Vietnam war can certainly duck a couple of
shoes. -- Chris Gehlker


Re: [sa-list] Re: A rant about FUZZY_OCR

Posted by Jo Rhett <jr...@netconsonance.com>.
On Apr 27, 2009, at 1:16 PM, Dan Mahoney, System Admin wrote:
> The problem exists now: there is PNG spam, and there will continue  
> to be, because it gets through.  Right now the only way I find this  
> blocked is if SpamCop blocks it.


Just as a point of reference, I'd like to note that we haven't  
bothered with FuzzyOCR here and absolutely none of the spam which  
reaches my inbox is a PNG or JPG or GIF spam.   SA does block it, and  
it does so without FuzzyOCR.

That said, we have jacked the scores for e-mail with images and no  
text and that might be why.   We never, ever receive valid e-mail with  
no text in it.
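
For anyone wanting to experiment with the same idea outside SA, the check 
can be sketched in Python with the standard email module. This is an 
illustration only, not how SpamAssassin does it; the function names and 
the 2.0 penalty are assumptions picked for the example.

```python
from email import message_from_string
from email.message import Message


def is_image_only(msg: Message) -> bool:
    """True if the message has an image part and no non-empty text part."""
    has_image = False
    has_text = False
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        maintype = part.get_content_maintype()
        if maintype == "image":
            has_image = True
        elif maintype == "text":
            # Whitespace-only bodies count as "no text". Note that an HTML
            # part holding nothing but an <img> tag still counts as text.
            payload = part.get_payload(decode=True) or b""
            if payload.strip():
                has_text = True
    return has_image and not has_text


def image_only_penalty(msg: Message, penalty: float = 2.0) -> float:
    """Extra score to add for image-only mail (penalty is an assumed value)."""
    return penalty if is_image_only(msg) else 0.0
```

A message parsed with message_from_string() can be passed straight in. As 
noted elsewhere in this thread, spam that pads the image with random text 
would not be caught by a check like this.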

-- 
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source  
and other randomness