Posted to users@spamassassin.apache.org by Mike Pepe <la...@doki-doki.net> on 2006/08/21 19:18:38 UTC
OCR plugin doesn't seem to work
Hey guys,
Running SA 3.1.1, on Fedora Core 3, with Perl 5.8.5
I installed the gocr and ImageMagick packages and copied the Ocr.pm and
Ocr.cf files into /etc/mail/spamassassin.
The tests don't seem to run: the pump 'n' dump GIFs are still arriving,
and I don't see the test listed in the headers. Other SARE and custom
rules in that directory are running, though. The permissions are the
same, etc. Anyone have any ideas?
# ls
70_sare_adult.cf     70_sare_uri1.cf           spamassassin-default.rc
70_sare_obfu0.cf     99_sare_fraud_post25x.cf  spamassassin-helper.sh
70_sare_obfu1.cf     99_sare_fraud_pre25x.cf   spamassassin-spamc.rc
70_sare_oem.cf       cathy_caparula.cf         tripwire.cf
70_sare_random.cf    init.pre                  v310.pre
70_sare_specific.cf  local.cf                  WebRedirect.cf
70_sare_spoof.cf     Ocr.cf                    WebRedirect.pm
70_sare_stocks.cf    Ocr.pm
70_sare_uri0.cf      RulesDuJour
-Mike
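One thing worth ruling out with a setup like this: the listing shows Ocr.pm and Ocr.cf in place, but it cannot show whether any configuration file contains a loadplugin line for the module, and SpamAssassin only loads plugins that are named in such a line. The following is a minimal Python sketch for listing the active loadplugin lines in a config directory. The function name find_loadplugin_lines is hypothetical, and the directory path is simply the one mentioned in the post.

```python
import re
from pathlib import Path

# Matches lines such as: loadplugin Some::Module /path/to/Module.pm
# The optional second group is the module path; a trailing "# comment" is ignored.
LOADPLUGIN_RE = re.compile(r"^\s*loadplugin\s+(\S+)(?:\s+([^#\s]\S*))?")


def find_loadplugin_lines(config_dir):
    """Return (file name, module, path or None) for every active loadplugin line."""
    found = []
    directory = Path(config_dir)
    # Look at *.pre files first, then *.cf files, each group in sorted order.
    for path in sorted(directory.glob("*.pre")) + sorted(directory.glob("*.cf")):
        for line in path.read_text(errors="replace").splitlines():
            # Commented-out lines start with '#' and therefore do not match.
            match = LOADPLUGIN_RE.match(line)
            if match:
                found.append((path.name, match.group(1), match.group(2)))
    return found


if __name__ == "__main__":
    # Assumed location, taken from the post above.
    for name, module, module_path in find_loadplugin_lines("/etc/mail/spamassassin"):
        print(f"{name}: {module} {module_path or ''}")
```

If the OCR module does not show up in the output, the rules in the .cf file have no plugin behind them, which would be consistent with the test never appearing in the headers.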
Re: OCR plugin doesn't seem to work
Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Mike Pepe wrote:
> decoder wrote:
>
>> Which OCR plugin are you using there? If it is the original
>> OcrPlugin, then you might try FuzzyOcr instead. The original
>> OcrPlugin was more proof-of-concept, and will cause you lots of
>> headaches with the current image spam...
>
> I did upgrade to FuzzyOcr after I read your message, but I don't
> think it's working. Other rules do seem to be catching these stock
> GIFs, though. Here are the headers from one of them:
>
> Content analysis details: (10.6 points, 5.0 required)
>
>  pts rule name              description
> ---- ---------------------- --------------------------------------------------
>  1.1 EXTRA_MPART_TYPE       Header has extraneous Content-type:...type= entry
>  4.2 HELO_DYNAMIC_IPADDR    Relay HELO'd using suspicious hostname (IP addr 1)
>  0.1 FORGED_RCVD_HELO       Received: contains a forged HELO
>  1.1 HTML_IMAGE_ONLY_32     BODY: HTML: images with 2800-3200 bytes of words
>  0.4 HTML_30_40             BODY: Message is 30% to 40% HTML
>  1.0 BAYES_60               BODY: Bayesian spam probability is 60 to 80%
>                             [score: 0.7765]
>  0.0 HTML_MESSAGE           BODY: HTML included in message
>  0.8 SARE_GIF_ATTACH        FULL: Email has a inline gif
>  2.0 RCVD_IN_SORBS_DUL      RBL: SORBS: sent directly from dynamic IP address
>                             [71.197.31.248 listed in dnsbl.sorbs.net]
>
> I don't see OCR mentioned in there at all. I still don't think it's
> working.
>
> Running spamassassin --lint doesn't indicate anything is wrong. How
> can I test it?
>
> -Mike
>
The download page of FuzzyOcr provides a sample-mails.tar.gz. It
contains some messages which should all get detected.
Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFE7BEyJQIKXnJyDxURAv18AKCg6TCSrH41ERtalz/H93/sqlsjXACdF5ue
FfD4tGxRS5cEWQ8of2aT/Co=
=xyHr
-----END PGP SIGNATURE-----
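The suggestion above, running known image-spam samples and seeing whether they get detected, can be sketched in Python roughly as follows. This is a sketch under stated assumptions, not the plugin author's procedure: it assumes the sample archive has been unpacked into a directory of message files, that the spamassassin command is on the PATH, and that an OCR hit shows up as a rule name containing "OCR" in the report that test mode (-t) appends. The function names and the marker are hypothetical.

```python
import subprocess
from pathlib import Path

REPORT_HEADING = "Content analysis details:"


def run_spamassassin(message_bytes):
    """Run `spamassassin -t` (test mode) on one message and return its output."""
    result = subprocess.run(
        ["spamassassin", "-t"],
        input=message_bytes,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.stdout.decode("utf-8", errors="replace")


def report_mentions(output, marker="OCR"):
    """True if the text after the last report heading contains marker.

    Only the report is searched, because the message body (for example
    base64 image data) could contain the marker by coincidence.
    """
    _, heading, report = output.rpartition(REPORT_HEADING)
    return bool(heading) and marker in report


def check_samples(sample_dir, marker="OCR", scan=run_spamassassin):
    """Map each file name in sample_dir to whether its report mentions marker."""
    results = {}
    for path in sorted(Path(sample_dir).iterdir()):
        if path.is_file():
            results[path.name] = report_mentions(scan(path.read_bytes()), marker)
    return results
```

Calling check_samples on the unpacked sample directory returns a dictionary of file names to True or False. If every sample comes back False, the plugin is most likely not being run at all. The scan parameter exists only so the loop can be exercised without SpamAssassin installed.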
Re: OCR plugin doesn't seem to work
Posted by Mike Pepe <la...@doki-doki.net>.
decoder wrote:
> Which OCR plugin are you using there? If it is the original OcrPlugin,
> then you might try FuzzyOcr instead. The original OcrPlugin was more
> proof-of-concept, and will cause you lots of headaches with the
> current image spam...
I did upgrade to FuzzyOcr after I read your message, but I don't think
it's working. Other rules do seem to be catching these stock GIFs, though.
Here are the headers from one of them:
Content analysis details: (10.6 points, 5.0 required)
 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.1 EXTRA_MPART_TYPE       Header has extraneous Content-type:...type= entry
 4.2 HELO_DYNAMIC_IPADDR    Relay HELO'd using suspicious hostname (IP addr 1)
 0.1 FORGED_RCVD_HELO       Received: contains a forged HELO
 1.1 HTML_IMAGE_ONLY_32     BODY: HTML: images with 2800-3200 bytes of words
 0.4 HTML_30_40             BODY: Message is 30% to 40% HTML
 1.0 BAYES_60               BODY: Bayesian spam probability is 60 to 80%
                            [score: 0.7765]
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.8 SARE_GIF_ATTACH        FULL: Email has a inline gif
 2.0 RCVD_IN_SORBS_DUL      RBL: SORBS: sent directly from dynamic IP address
                            [71.197.31.248 listed in dnsbl.sorbs.net]
I don't see OCR mentioned in there at all. I still don't think it's working.
Running spamassassin --lint doesn't indicate anything is wrong. How can I test it?
-Mike
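Checking a report like the one above by eye works, but the same check can be done mechanically. Below is a minimal Python sketch that pulls the (score, rule name) pairs out of a "Content analysis details" report and lists any rule whose name contains "OCR". The function names are hypothetical, and both the upper-case rule-name pattern and the "OCR" marker are assumptions about naming, not something the report format guarantees.

```python
import re

# One hit per line: a score such as 1.1 or -0.5, then the rule name.
# Assumes rule names consist of upper-case letters, digits and underscores.
HIT_RE = re.compile(r"^\s*(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]+)\b")


def parse_hits(report_text):
    """Return a list of (score, rule_name) pairs found in a report."""
    hits = []
    for line in report_text.splitlines():
        # Header, separator and wrapped description lines do not match.
        match = HIT_RE.match(line)
        if match:
            hits.append((float(match.group(1)), match.group(2)))
    return hits


def ocr_hits(report_text, marker="OCR"):
    """Return the rule names that contain marker (an assumed naming convention)."""
    return [name for _, name in parse_hits(report_text) if marker in name]


# A shortened copy of the report quoted in this message.
SAMPLE = """\
 pts rule name              description
---- ---------------------- ----------------------------------------
 1.0 BAYES_60               BODY: Bayesian spam probability is 60 to 80%
                            [score: 0.7765]
 0.8 SARE_GIF_ATTACH        FULL: Email has a inline gif
"""

if __name__ == "__main__":
    print(parse_hits(SAMPLE))
    print(ocr_hits(SAMPLE))
```

For the report in this message, ocr_hits finds nothing, which matches the observation that no OCR rule is mentioned.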
Re: OCR plugin doesn't seem to work
Posted by decoder <de...@own-hero.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Mike Pepe wrote:
> Hey guys,
>
> Running SA 3.1.1, on Fedora Core 3, with Perl 5.8.5
>
> I installed the gocr and ImageMagick packages and copied the Ocr.pm
> and Ocr.cf files into /etc/mail/spamassassin.
>
> The tests don't seem to run: the pump 'n' dump GIFs are still
> arriving, and I don't see the test listed in the headers. Other SARE
> and custom rules in that directory are running, though. The
> permissions are the same, etc. Anyone have any ideas?
>
> # ls
> 70_sare_adult.cf     70_sare_uri1.cf           spamassassin-default.rc
> 70_sare_obfu0.cf     99_sare_fraud_post25x.cf  spamassassin-helper.sh
> 70_sare_obfu1.cf     99_sare_fraud_pre25x.cf   spamassassin-spamc.rc
> 70_sare_oem.cf       cathy_caparula.cf         tripwire.cf
> 70_sare_random.cf    init.pre                  v310.pre
> 70_sare_specific.cf  local.cf                  WebRedirect.cf
> 70_sare_spoof.cf     Ocr.cf                    WebRedirect.pm
> 70_sare_stocks.cf    Ocr.pm
> 70_sare_uri0.cf      RulesDuJour
>
Which OCR plugin are you using there? If it is the original OcrPlugin,
then you might try FuzzyOcr instead. The original OcrPlugin was more
proof-of-concept, and will cause you lots of headaches with the
current image spam...
Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFE6e/4JQIKXnJyDxURAlpSAJwInsGumasFgOK0ZOGp6M5W5Atw1ACeMqpx
QKBndV7iGnXOuxQJVip/ox4=
=GpHQ
-----END PGP SIGNATURE-----