Posted to users@spamassassin.apache.org by Evan Platt <ev...@espphotography.com> on 2006/11/28 20:19:13 UTC

Installed FuzzyOCR - What am I missing?

Installed FuzzyOCR on my os/x box per 
http://fuzzyocr.own-hero.net/wiki/Installation-3.x  .

Based on my reading of it, I don't need to do anything other than put 
the FuzzyOcr.cf file in my spamassassin directory (which on my 
install is /private/etc/opt/mail/spamassassin/ ) .

So I have FuzzyOcr.cf, FuzzyOcr.pm (chmod +x'd) .

The relevant (AFAICT) parts of .cf are:

loadplugin FuzzyOcr FuzzyOcr.pm
body FUZZY_OCR eval:fuzzyocr_check()
describe FUZZY_OCR Mail contains an image with common spam text inside
body FUZZY_OCR_WRONG_CTYPE eval:dummy_check()
describe FUZZY_OCR_WRONG_CTYPE Mail contains an image with wrong content-type set
body FUZZY_OCR_CORRUPT_IMG eval:dummy_check()
describe FUZZY_OCR_CORRUPT_IMG Mail contains a corrupted image
body FUZZY_OCR_KNOWN_HASH eval:dummy_check()
describe FUZZY_OCR_KNOWN_HASH Mail contains an image with known hash

focr_personal_wordlist ./spamassassin/FuzzyOcr.words
(.words is in the same directory).

I then ran
spamassassin < animated-gif.eml > out

out shows no FuzzyOCR hits.

Am I missing something obvious?

If I'm not providing enough details, please let me know.

Thanks.

Evan


Re: Installed FuzzyOCR - What am I missing?

Posted by Chris Purves <ch...@northfolk.ca>.
Evan Platt wrote:
> At 02:56 PM 11/28/2006, you wrote:
> 
>> Last month there was a discussion thread on this list about that
>> exact topic. Search either the Apache list archives or the GMANE
>> archives. For example see:
>>
>> http://mail-archives.apache.org/mod_mbox/spamassassin-users/200610.mbox/%3cPine.HPX.4.58.0610171718530.12758@d-is00.icaen.uiowa.edu%3e 
>>
> 
> Thanks to everyone especially Decoder, I think I'm up and running.
> 
> png is the only one not working.
> 
> Any reason NOT to assign 10 points to fuzzy ocr tripped words?

The defaults are already quite high, and don't forget that more points 
are added for more words found. I think the default is one point for 
every word matched, but requiring that at least two words are found. 
Since most of the drug spams have several words, you are usually over 10 
points anyway.
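
To make that arithmetic concrete, here is a minimal sketch in Python of the
scoring as I understand it from the description above. The function name and
the parameters (base_score, add_score, counts_required) are made up for
illustration, and the numbers are placeholders rather than confirmed FuzzyOcr
defaults; check your own FuzzyOcr.cf for the real values.

```python
def ocr_score(words_matched: int,
              base_score: float = 5.0,    # placeholder, not a confirmed default
              add_score: float = 1.0,     # "one point for every word matched"
              counts_required: int = 2) -> float:  # "at least two words"
    """Hypothetical model of the scoring described in this thread."""
    if words_matched < counts_required:
        # Too few matched words: the rule does not fire at all.
        return 0.0
    # Base score once the minimum is met, plus a bonus per additional word.
    extra_words = words_matched - counts_required
    return base_score + extra_words * add_score
```

With these placeholder values, an image matching seven words would come out
at 10 points, while a single stray match would score nothing.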

> I mean I wouldn't add 10 points just because someone typed the V word in 
> an e-mail to me, but I can't think of an instance where I'd expect a GIF 
> message with it in it.

Someone might send you a copy of a comic strip about an old guy visiting 
the doctor.  You might miss out on some poor taste humour.

-- 
Chris


Re: Installed FuzzyOCR - What am I missing?

Posted by David B Funk <db...@engineering.uiowa.edu>.
On Tue, 28 Nov 2006, Evan Platt wrote:

> Thanks to everyone especially Decoder, I think I'm up and running.
>
> png is the only one not working.
>
> Any reason NOT to assign 10 points to fuzzy ocr tripped words?
>
> I mean I wouldn't add 10 points just because someone typed the V word
> in an e-mail to me, but I can't think of an instance where I'd expect
> a GIF message with it in it.

You -do- understand that the 'fuzzy' part of FuzzyOCR means that it
does inexact matching on the characters that it pulls out of an
image? So, for example, a college newsletter that I received, which
had a school logo image, fired on FuzzyOCR claiming to match "company".

I've also seen it fire on things such as an airline ticket confirmation
notice, a religious newsletter, and a technical bulletin. Just one
word for each, which with the default score wasn't enough to tag
as spam, but with a score of 10 would be a guaranteed FP.
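
To illustrate how inexact matching can fire on OCR noise, here is a small
Python sketch of approximate substring matching by edit distance. It is only
a stand-in for the idea, not the plugin's actual matcher, and the function
names and the max_errors parameter are assumptions made up for this example.

```python
def min_edit_distance_in(text: str, word: str) -> int:
    """Smallest edit distance between `word` and any substring of `text`."""
    # Row for the empty word: it matches anywhere in the text at cost 0,
    # which is what lets the match start at any position.
    prev = [0] * (len(text) + 1)
    for i, wc in enumerate(word, start=1):
        # Matching i word characters against an empty text costs i deletions.
        cur = [i] + [0] * len(text)
        for j, tc in enumerate(text, start=1):
            cur[j] = min(
                prev[j] + 1,                # word character missing from text
                cur[j - 1] + 1,             # extra character in text
                prev[j - 1] + (wc != tc),   # exact match or substitution
            )
        prev = cur
    # The match may end at any position in the text.
    return min(prev)


def fuzzy_hit(text: str, word: str, max_errors: int) -> bool:
    """True if `word` occurs in `text` with at most `max_errors` edits."""
    return min_edit_distance_in(text.lower(), word.lower()) <= max_errors
```

For example, OCR output that reads "cornpany" (an "m" misread as "rn") is two
edits away from "company", so fuzzy_hit("cornpany", "company", 2) is True
while fuzzy_hit("cornpany", "company", 1) is False. The looser the allowed
error count, the more random logo text will look like a listed word.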

Dave

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{

Re: Installed FuzzyOCR - What am I missing?

Posted by Evan Platt <ev...@espphotography.com>.
At 02:56 PM 11/28/2006, you wrote:

>Last month there was a discussion thread on this list about that
>exact topic. Search either the Apache list archives or the GMANE
>archives. For example see:
>
>http://mail-archives.apache.org/mod_mbox/spamassassin-users/200610.mbox/%3cPine.HPX.4.58.0610171718530.12758@d-is00.icaen.uiowa.edu%3e

Thanks to everyone especially Decoder, I think I'm up and running.

png is the only one not working.

Any reason NOT to assign 10 points to fuzzy ocr tripped words?

I mean I wouldn't add 10 points just because someone typed the V word 
in an e-mail to me, but I can't think of an instance where I'd expect 
a GIF message with it in it.

Yes, I run my own mail server, I'm the only user. 


Re: Installed FuzzyOCR - What am I missing?

Posted by David B Funk <db...@engineering.uiowa.edu>.
On Tue, 28 Nov 2006, Evan Platt wrote:

> Now:
> [2006-11-28 13:08:00] Unexpected error in pipe to external programs.
>                        Please check that all helper programs are
> installed and in the correct path.
>                        (Pipe Command "/sw/bin/giftopnm -", Pipe exit
> code 1 (""), Temporary file: "/tmp/.spamassassin64852a8I0tmp")
> [2006-11-28 13:08:00] Debug mode: FuzzyOcr ending successfully...
>
>
> Not having any luck googling that error (well not having any good results.
>
> # ls -al /sw/bin/giftopnm
> 24 -rwxr-xr-x 1 root admin 22536 Mar 22  2006 /sw/bin/giftopnm*

Last month there was a discussion thread on this list about that
exact topic. Search either the Apache list archives or the GMANE
archives. For example see:

http://mail-archives.apache.org/mod_mbox/spamassassin-users/200610.mbox/%3cPine.HPX.4.58.0610171718530.12758@d-is00.icaen.uiowa.edu%3e



Re: Installed FuzzyOCR - What am I missing?

Posted by Evan Platt <ev...@espphotography.com>.
At 12:53 PM 11/28/2006, you wrote:
>Did you specify a logfile? If not, do so and check for output there :)

Ahh. Nothing was being written to the log because of file permissions.

Now:
[2006-11-28 13:08:00] Unexpected error in pipe to external programs.
                       Please check that all helper programs are 
installed and in the correct path.
                       (Pipe Command "/sw/bin/giftopnm -", Pipe exit 
code 1 (""), Temporary file: "/tmp/.spamassassin64852a8I0tmp")
[2006-11-28 13:08:00] Debug mode: FuzzyOcr ending successfully...


Not having any luck googling that error (well, not finding any good results).

# ls -al /sw/bin/giftopnm
24 -rwxr-xr-x 1 root admin 22536 Mar 22  2006 /sw/bin/giftopnm*
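
One way to narrow down an exit code 1 like this is to rerun the same pipe by
hand and look at what the helper prints on stderr. A minimal Python sketch,
assuming you have saved the GIF attachment out of the test message to a file;
the run_helper name and the image.gif filename are made up for illustration.

```python
import subprocess


def run_helper(cmd, input_path):
    """Feed a file to a helper on stdin; return (exit code, stderr text)."""
    with open(input_path, "rb") as fh:
        proc = subprocess.run(
            cmd,
            stdin=fh,                    # helper reads the image from stdin
            stdout=subprocess.DEVNULL,   # converted output is not needed here
            stderr=subprocess.PIPE,      # keep any error message
        )
    return proc.returncode, proc.stderr.decode(errors="replace")


# Example call (helper path taken from the log above, image.gif is hypothetical):
# code, err = run_helper(["/sw/bin/giftopnm", "-"], "image.gif")
# print("exit code:", code, "stderr:", err or "(empty)")
```

If the helper fails the same way outside SpamAssassin, the problem is with
the helper or the image rather than with the plugin configuration.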

Evan



Re: Installed FuzzyOCR - What am I missing?

Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Evan Platt wrote:
> At 12:37 PM 11/28/2006, you wrote:
>> I forgot to tell you that you also need to increase the verbosity
>> factor of the plugin:
>>
>> focr_verbose 2
>>
>> will make sure that you see more (i.e. everything ;))
>>
>> Best regards,
>
>
> Did that, reran spamassassin -D< animated--gif.eml > out , same
> results :(
Did you specify a logfile? If not, do so and check for output there :)

Best regards,

Chris


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFbKHIJQIKXnJyDxURAqiSAJ9aRyxKzuz//TW2XCicTiiDB6nLPgCfT/uq
8XuY1ycxz3nVDPDuyDf6gBw=
=ypSP
-----END PGP SIGNATURE-----


Re: Installed FuzzyOCR - What am I missing?

Posted by Evan Platt <ev...@espphotography.com>.
At 12:37 PM 11/28/2006, you wrote:
>I forgot to tell you that you also need to increase the verbosity
>factor of the plugin:
>
>focr_verbose 2
>
>will make sure that you see more (i.e. everything ;))
>
>Best regards,


Did that, reran spamassassin -D < animated--gif.eml > out, same results :(


Re: Installed FuzzyOCR - What am I missing?

Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Evan Platt wrote:
> At 11:34 AM 11/28/2006, you wrote:
>> You should try to run spamassassin with -D to see more debug
>> output. Watch out for FuzzyOcr lines :)
>
> Didn't think of that.. :)
>
> Ok, did that.
>
> Only a few lines have Fuzzy:
I forgot to tell you that you also need to increase the verbosity
factor of the plugin:

focr_verbose 2

will make sure that you see more (i.e. everything ;))

Best regards,


Chris

>
> [554] dbg: config: read file /etc/opt/mail/spamassassin/FuzzyOcr.cf
> [554] dbg: plugin: fixed relative path: /etc/opt/mail/spamassassin/FuzzyOcr.pm
> [554] dbg: plugin: loading FuzzyOcr from /etc/opt/mail/spamassassin/FuzzyOcr.pm
> [554] dbg: plugin: FuzzyOcr=HASH(0x1d0e4a4) implements 'parse_config'
>
> Nothing there looks like a problem?
>
> I put the entire debug session at
> http://www.espphotography.com/sadebug.txt
>
> Thanks.
>
> Evan

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFbJ4TJQIKXnJyDxURAv+/AJ91Hiq7q8uZWopDe1aDvkZkP+KaTACfX0kt
QF+pEYZA347kjVZBmtzLSi4=
=Geew
-----END PGP SIGNATURE-----

Help with sa-stats

Posted by John Tice <li...@johntice.com>.
I am trying to install Dallas Engelken's version of sa-stats, and as a
rank novice I could use some help...
http://www.rulesemporium.com/programs/sa-stats.txt

I'm on a VPS with cPanel and multiple domains. I installed this into the
cgi-bin in my domain (not the primary server domain) and it executes,
except that it contains no data. So I moved it to the server, to
/root/public_html/cgi-bin/, but it's not found when I point my browser
at it. Permissions are 755. I'm guessing it's not in the right place, or
else I'd at least get the results page as I do when it's in the
mydomain/cgi-bin location. Where should I put it? Do I need to show it
the path to the logs, and if so, where are they located?

Also, is there a different version of this included with
SpamAssassin, and how do I turn it on or access it?
Thanks-

Re: Installed FuzzyOCR - What am I missing?

Posted by Evan Platt <ev...@espphotography.com>.
At 11:34 AM 11/28/2006, you wrote:
>You should try to run spamassassin with -D to see more debug output.
>Watch out for FuzzyOcr lines :)

Didn't think of that.. :)

Ok, did that.

Only a few lines have Fuzzy:

[554] dbg: config: read file /etc/opt/mail/spamassassin/FuzzyOcr.cf
[554] dbg: plugin: fixed relative path: /etc/opt/mail/spamassassin/FuzzyOcr.pm
[554] dbg: plugin: loading FuzzyOcr from /etc/opt/mail/spamassassin/FuzzyOcr.pm
[554] dbg: plugin: FuzzyOcr=HASH(0x1d0e4a4) implements 'parse_config'

Nothing there looks like a problem?

I put the entire debug session at
http://www.espphotography.com/sadebug.txt

Thanks.

Evan 


Re: Installed FuzzyOCR - What am I missing?

Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Evan Platt wrote:
> Installed FuzzyOCR on my os/x box per
> http://fuzzyocr.own-hero.net/wiki/Installation-3.x  .
>
> Based on my reading of it, I don't need to do anything other than
> put the FuzzyOcr.cf file in my spamassassin directory (which on my
> install is /private/etc/opt/mail/spamassassin/ ) .
>
> So I have FuzzyOcr.cf, FuzzyOcr.pm (chmod +x'd) .
>
> The relevent (AFAICT) parts of .cf are:
>
> loadplugin FuzzyOcr FuzzyOcr.pm
> body FUZZY_OCR eval:fuzzyocr_check()
> describe FUZZY_OCR Mail contains an image with common spam text inside
> body FUZZY_OCR_WRONG_CTYPE eval:dummy_check()
> describe FUZZY_OCR_WRONG_CTYPE Mail contains an image with wrong content-type set
> body FUZZY_OCR_CORRUPT_IMG eval:dummy_check()
> describe FUZZY_OCR_CORRUPT_IMG Mail contains a corrupted image
> body FUZZY_OCR_KNOWN_HASH eval:dummy_check()
> describe FUZZY_OCR_KNOWN_HASH Mail contains an image with known hash
>
> focr_personal_wordlist ./spamassassin/FuzzyOcr.words
> (.words is in the same directory).
>
> I then ran
> spamassassin < animated-gif.eml > out
>
> out shows no FuzzyOCR hits.
>
> Am I missing something obvious?
>
> If I'm not providing enough details, please let me know.
You should try to run spamassassin with -D to see more debug output.
Watch out for FuzzyOcr lines :)

Best regards,


Chris
>
> Thanks.
>
> Evan
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFbI8vJQIKXnJyDxURAtR2AJ90OR9yKBE2rngmCFiLn3W+8yClCQCgqUKJ
15VKwaPTeOd2sxcRU6U3qrg=
=aMj2
-----END PGP SIGNATURE-----


Re: Installed FuzzyOCR - What am I missing?

Posted by snowcrash+spamassassin <sc...@gmail.com>.
> spamassassin < animated-gif.eml > out
>
> out shows no FuzzyOCR hits.
>
> Am I missing something obvious?

when *i* first ran tests, i'd set:

     focr_autodisable_score 10

the score hit "10" too soon ... and fuzzy ocr didn't run/score any hits.

set it 'high', e.g.,

     focr_autodisable_score 999

then try again

worked for me.

hth.

Re: Installed FuzzyOCR - What am I missing?

Posted by Odhiambo Washington <wa...@wananchi.com>.
* On 28/11/06 11:19 -0800, Evan Platt wrote:
| Installed FuzzyOCR on my os/x box per 
| http://fuzzyocr.own-hero.net/wiki/Installation-3.x  .
| 
| Based on my reading of it, I don't need to do anything other than put 
| the FuzzyOcr.cf file in my spamassassin directory (which on my 
| install is /private/etc/opt/mail/spamassassin/ ) .


Really?

FuzzyOCR requires that you have the following installed:

gocr ImageMagick netpbm libungif gifsicle ocrad

(I don't know what libungif would be called in OS X)

perl -MCPAN -e 'install("String::Approx", "MLDBM")'

| So I have FuzzyOcr.cf, FuzzyOcr.pm (chmod +x'd) .

Not necessary!

| The relevent (AFAICT) parts of .cf are:

Hmm, the FuzzyOCR that you use is from somewhere else. Have you verified
that you have all the components by reading the FuzzyOCR.cf??

Do you have all those binaries it is supposed to call?
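
A quick way to answer that is to look each helper up on the search path. The
Python sketch below does that; the missing_helpers name is made up, and the
list of executable names in the example is a guess at what the packages above
install, so adjust it to whatever your FuzzyOcr.cf actually points at.

```python
import os
import shutil


def missing_helpers(names, extra_dirs=()):
    """Return the helper names that cannot be found as executables."""
    # Search any extra directories first, then the normal PATH.
    search_path = os.pathsep.join([*extra_dirs, os.environ.get("PATH", "")])
    return [name for name in names
            if shutil.which(name, path=search_path) is None]


# Example call (names and the /sw/bin directory are illustrative):
# print(missing_helpers(["gocr", "ocrad", "gifsicle", "giftopnm"],
#                       extra_dirs=["/sw/bin"]))
```

Keep in mind that the PATH SpamAssassin runs with may differ from the one in
your login shell, so a helper found here can still be missing for the plugin
unless its full path is configured.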


-Wash

http://www.netmeister.org/news/learn2quote.html

DISCLAIMER: See http://www.wananchi.com/bms/terms.php

--
+======================================================================+
    |\      _,,,---,,_     | Odhiambo Washington    <wa...@wananchi.com>
Zzz /,`.-'`'    -.  ;-;;,_ | Wananchi Online Ltd.   www.wananchi.com
   |,4-  ) )-,_. ,\ (  `'-'| Tel: +254 20 313985-9  +254 20 313922
  '---''(_/--'  `-'\_)     | GSM: +254 722 743223   +254 733 744121
+======================================================================+

The shortest distance between two points is under construction.
		-- Noelie Alito