Posted to users@spamassassin.apache.org by David Baron <d_...@012.net.il> on 2007/01/08 15:08:07 UTC
FuzzyOcr -- how do I know it is working?
Installed the Debian package. How do I know it is working? Are all those
"SPAMMY" rules its?
Re: FuzzyOcr -- how do I know it is working?
Posted by David Baron <d_...@012.net.il>.
On Monday 08 January 2007 18:34, Gary V wrote:
> >Installed the Debian package. How do I know it is working? Are all those
> >"SPAMMY" rules its?
>
> It looks like you are using amavisd-new. SPAMMY essentially means a message
> scored between tag2_level and kill_level and is not directly related to
> FuzzyOcr. If you get FuzzyOcr hits you will see FUZZY_OCR rules hit.
> Remember that messages that score over focr_autodisable_score (I think 10
> is the default) will not get scanned. Also remember that you will need to
> reload amavisd-new after each change to FuzzyOcr.cf if you want to see the
> changes.
I am not using amavisd.
I did see some STOCK_IMAGE type hits.
Many of the messages were > 10 so would not have been scanned.
>
> In FuzzyOcr.cf I suggest TEMPORARILY increasing focr_verbose:
> focr_verbose 2
>
> Personally I set it back to 0 when not debugging.
>
> and enabling the log (it must be writable by the user running SA):
> focr_logfile /var/lib/amavis/FuzzyOcr.log
OK, I will set a log file in a normal place.
I have both gocr and ocrad installed.
RE: FuzzyOcr -- how do I know it is working?
Posted by Gary V <mr...@hotmail.com>.
>Installed the Debian package. How do I know it is working? Are all those
>"SPAMMY" rules its?
It looks like you are using amavisd-new. SPAMMY essentially means a message
scored between tag2_level and kill_level and is not directly related to
FuzzyOcr. If you get FuzzyOcr hits you will see FUZZY_OCR rules hit.
Remember that messages that score over focr_autodisable_score (I think 10 is
the default) will not get scanned. Also remember that you will need to
reload amavisd-new after each change to FuzzyOcr.cf if you want to see the
changes.
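To check a single scanned message for FuzzyOcr hits, something like the following might help. It is only a sketch: it assumes the message was saved to a file after SpamAssassin added its headers, and that the rule names appear somewhere in those headers (for example in the tests= list of X-Spam-Status). The function name fuzzyocr_hits is made up for this example.

```shell
#!/bin/sh
# List the distinct FUZZY_OCR* rule names that appear in a message's headers.
# Assumption: the scanned message carries SpamAssassin's headers and the
# rule names are written there (e.g. in the tests= list of X-Spam-Status).
fuzzyocr_hits() {
    # $1: path to a saved, already-scanned message.
    # sed stops at the first blank line, so only the header block is read
    # and the message body can never produce a false match.
    sed '/^$/q' "$1" | grep -o 'FUZZY_OCR[A-Z0-9_]*' | sort -u
}
```

Usage: fuzzyocr_hits /path/to/saved-message. Empty output means no FUZZY_OCR rule was recorded for that message, which is also what you would expect for mail that scored above focr_autodisable_score and was never scanned.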
In FuzzyOcr.cf I suggest TEMPORARILY increasing focr_verbose:
focr_verbose 2
Personally I set it back to 0 when not debugging.
and enabling the log (it must be writable by the user running SA):
focr_logfile /var/lib/amavis/FuzzyOcr.log
Placing it in the user's home directory like this should be enough, but just
in case:
touch /var/lib/amavis/FuzzyOcr.log
chown amavis:amavis /var/lib/amavis/FuzzyOcr.log
(or)
focr_logfile /tmp/FuzzyOcr.log
and:
chown amavis:amavis /tmp/FuzzyOcr.log
(but be careful not to fill up the /tmp directory)
Then (for example) tail -f /var/lib/amavis/FuzzyOcr.log and send a message
through with an image (preferably a stock scam or pharmacy spam image with
legible text).
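Before sending the test message, it may be worth confirming that the configured log file really is writable. Here is a rough sketch; the function names are invented for this example, the path to FuzzyOcr.cf is passed in because it differs between installs, and it assumes that the last uncommented focr_logfile line is the one that takes effect.

```shell
#!/bin/sh
# Print the focr_logfile path from a FuzzyOcr.cf file.
# Assumption: if the option appears more than once, the last
# uncommented occurrence is the one in effect.
focr_logfile_path() {
    awk '$1 == "focr_logfile" { path = $2 } END { if (path != "") print path }' "$1"
}

# Report whether the current user can write to that log file.
# A file that does not exist yet counts as writable when its
# directory is writable, since it can then be created.
check_focr_log() {
    log=$(focr_logfile_path "$1")
    if [ -z "$log" ]; then
        echo "no focr_logfile set in $1"
        return 1
    fi
    if [ -w "$log" ] || { [ ! -e "$log" ] && [ -w "$(dirname "$log")" ]; }; then
        echo "writable: $log"
    else
        echo "NOT writable: $log"
        return 1
    fi
}
```

Run check_focr_log with the path to your FuzzyOcr.cf as the user that runs SA (amavis in the example above). Running it as root tells you nothing, since root can write almost anywhere.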
Note that on a Debian system you will get errors because Debian is using an
older version of netpbm which does not contain all the utilities FuzzyOcr
expects to find. I personally comment out the preprocessors and scansets
that use the missing components. Whether I'm doing it correctly is another
matter.
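A quick way to see which of the helper programs are actually missing before editing anything is a loop like the one below. This is a sketch, not part of FuzzyOcr: missing_helpers is a name made up here, the program names are simply the ones mentioned in this thread, and it only looks at the PATH of the current shell, which may not match where FuzzyOcr itself searches.

```shell
#!/bin/sh
# Print each named program that cannot be found on the current PATH.
# Assumption: the PATH seen here is similar to the one FuzzyOcr uses;
# if FuzzyOcr is configured with its own search path, results may differ.
missing_helpers() {
    for bin in "$@"; do
        command -v "$bin" >/dev/null 2>&1 || echo "$bin"
    done
}

# The OCR engines and netpbm helpers named in this thread.
missing_helpers gocr ocrad tesseract pnmnorm pnminvert ppmtopgm pamthreshold pamtopnm
```

Anything it prints is a candidate for commenting out, along with any preprocessor or scanset that depends on it.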
Regarding the Debian package, I posted this on the devel-spam mailing list:
#################
I think it would be good if the package installed both ocrad and gocr
since ocrad does a good job.
Also, you should possibly comment out the preprocessors (and scansets
that use them) that are not included with the currently available
Debian netpbm package. This would prevent errors like:
Cannot find executable for ocrad
Cannot find executable for pamthreshold
Cannot find executable for pamtopnm
Cannot find executable for tesseract
Skipping ocrad, invalid command '$ocrad'
Skipping ocrad-invert, invalid command '$ocrad'
Skipping ocrad-decolorize-invert, invalid command '$ocrad'
Skipping ocrad-decolorize, invalid command '$ocrad'
and with ocrad installed:
Cannot find executable for pamthreshold
Cannot find executable for pamtopnm
Cannot find executable for tesseract
Error running preprocessor(pamthreshold): pamthreshold -simple -threshold 0.5
Errors in Scanset "ocrad-decolorize-invert"
Return code: 2048, Error: save_execute: failed to exec pamthreshold -simple -threshold 0.5: No such file or directory at /usr/share/perl5/FuzzyOcr/Misc.pm line 173.
Skipping scanset because of errors, trying next...
Error running preprocessor(pamthreshold): pamthreshold -simple -threshold 0.5
Errors in Scanset "ocrad-decolorize"
Return code: 2048, Error: save_execute: failed to exec pamthreshold -simple -threshold 0.5: No such file or directory at /usr/share/perl5/FuzzyOcr/Misc.pm line 173.
Skipping scanset because of errors, trying next...
Possibly (but I'm not certain):
--- FuzzyOcr.cf-original 2007-01-07 16:27:17.093798195 -0700
+++ FuzzyOcr.cf 2007-01-07 16:29:12.319402455 -0700
@@ -99,8 +99,9 @@
# Include additional scanner/preprocessor commands here:
#
-focr_bin_helper pnmnorm, pnminvert, pamthreshold, ppmtopgm, pamtopnm
-focr_bin_helper tesseract
+#focr_bin_helper pnmnorm, pnminvert, pamthreshold, ppmtopgm, pamtopnm
+#focr_bin_helper tesseract
+focr_bin_helper pnmnorm, pnminvert, ppmtopgm
--- FuzzyOcr.scansets-original 2007-01-07 16:27:53.607240168 -0700
+++ FuzzyOcr.scansets 2007-01-07 16:29:58.825582474 -0700
@@ -18,19 +18,19 @@
args = -s5 -i $input
}
-# Inverted Ocrad scanset with decolorization
-scanset ocrad-decolorize-invert {
- preprocessors = ppmtopgm, pamthreshold, pamtopnm
- command = $ocrad
- args = -s5 -i $input
-}
+## Inverted Ocrad scanset with decolorization
+#scanset ocrad-decolorize-invert {
+# preprocessors = ppmtopgm, pamthreshold, pamtopnm
+# command = $ocrad
+# args = -s5 -i $input
+#}
-# Ocrad scanset with decolorization
-scanset ocrad-decolorize {
- preprocessors = ppmtopgm, pamthreshold, pamtopnm
- command = $ocrad
- args = -s5 $input
-}
+## Ocrad scanset with decolorization
+#scanset ocrad-decolorize {
+# preprocessors = ppmtopgm, pamthreshold, pamtopnm
+# command = $ocrad
+# args = -s5 $input
+#}
--- FuzzyOcr.preps-original 2007-01-07 16:27:39.158044309 -0700
+++ FuzzyOcr.preps 2007-01-07 16:30:51.907932931 -0700
@@ -16,16 +16,16 @@
command = ppmtopgm
}
-# Converts PAM to PNM
-preprocessor pamtopnm {
- command = pamtopnm
-}
+## Converts PAM to PNM
+#preprocessor pamtopnm {
+# command = pamtopnm
+#}
-# Uses thresholding on the PAM file
-preprocessor pamthreshold {
- command = pamthreshold
- args = -simple -threshold 0.5
-}
+## Uses thresholding on the PAM file
+#preprocessor pamthreshold {
+# command = pamthreshold
+# args = -simple -threshold 0.5
+#}
###################
Gary V