Posted to users@spamassassin.apache.org by Mário Gamito <ga...@gmail.com> on 2007/03/10 10:17:03 UTC

FuzzyOCR gives very low scores

Hi,

I've just installed FuzzyOCR and it's really a great tool.
Awesome.

I think it just has a glitch (maybe it's my fault, that's why I'm asking).
It gives very low scores to the messages.

I sent this testing e-mail with this picture:
http://www.gamito.org/teste.jpg

All the words are in FuzzyOCR.words and yes, it was marked as SPAM, but 
only with a 6.4 score.

Does anyone care to share experiences ?

Warm Regards,
Mário Gamito

Re: FuzzyOCR gives very low scores

Posted by René Berber <r....@computer.org>.
Mário Gamito wrote:
[snip]
> [30747] info: rules: meta test DIGEST_MULTIPLE has undefined dependency
> 'DCC_CHECK'
> [30747] info: rules: meta test SARE_SPEC_PROLEO_M2a has dependency
> 'MIME_QP_LONG_LINE' with a zero score
> [30747] info: rules: meta test SARE_HEAD_SUBJ_RAND has undefined
> dependency 'SARE_XMAIL_SUSP2'
> [30747] info: rules: meta test SARE_HEAD_SUBJ_RAND has undefined
> dependency 'SARE_HEAD_XAUTH_WARN'
> [30747] info: rules: meta test SARE_HEAD_SUBJ_RAND has dependency
> 'X_AUTH_WARN_FAKED' with a zero score
> [30747] info: rules: meta test SARE_RD_SAFE has undefined dependency
> 'SARE_RD_SAFE_MKSHRT'
> [30747] info: rules: meta test SARE_RD_SAFE has undefined dependency
> 'SARE_RD_SAFE_GT'
> [30747] info: rules: meta test SARE_RD_SAFE has undefined dependency
> 'SARE_RD_SAFE_TINY'
> [30747] info: rules: meta test SARE_OBFU_CIALIS has undefined dependency
> 'SARE_OBFU_CIALIS2'
[snip]
> 
> What are those "undefined dependencies" ?

As you can see, most of them come from SARE rules. They are only informational
messages, not a real problem, and they mean literally what they say: a meta rule
refers to another rule that is not defined in any of the rule files you load.

Take SARE_OBFU_CIALIS, for instance. You probably have the file 70_sare_obfu0.cf
in /etc/mail/spamassassin, but your copy must differ from mine, since I can't
find any reference to SARE_OBFU_CIALIS2 in my copy; perhaps you have an old version.

Try looking through the SARE files with: grep SARE_OBFU_CIALIS2 *.cf; better yet,
try updating your rules with "Rules Du Jour".
-- 
René Berber
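
The check behind those "undefined dependency" messages can be illustrated in miniature. The Python sketch below is not how SpamAssassin itself validates rules; it is a simplified illustration that assumes every rule is declared on a single line as "<type> <NAME> <definition>" and treats each identifier inside a meta expression as a dependency. The rule SARE_OBFU_CIALIS1 and both rule definitions in the sample are made up for the example.

```python
import re

# Simplified subset of rule types whose second column is the rule name.
RULE_TYPES = {"header", "body", "rawbody", "uri", "full", "meta"}
# Anything shaped like an identifier inside a meta expression counts as a rule name.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def undefined_meta_dependencies(config_text):
    """Map each meta rule to the names it references that are never defined."""
    defined = set()
    metas = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # crude comment stripping
        parts = line.split(None, 2)
        if len(parts) < 3 or parts[0] not in RULE_TYPES:
            continue
        kind, name, rest = parts
        defined.add(name)
        if kind == "meta":
            metas[name] = NAME_RE.findall(rest)
    result = {}
    for name, deps in metas.items():
        missing = sorted({dep for dep in deps if dep not in defined})
        if missing:
            result[name] = missing
    return result


# Hypothetical rule file contents, for illustration only.
sample = """
body  SARE_OBFU_CIALIS1  /c[i1]al[i1]s/i
meta  SARE_OBFU_CIALIS   (SARE_OBFU_CIALIS1 || SARE_OBFU_CIALIS2)
"""

print(undefined_meta_dependencies(sample))
```

Run against real .cf files (read into one string), a report like this points at the same rules the grep command would find by hand.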


Re: FuzzyOCR gives very low scores

Posted by Mário Gamito <ga...@gmail.com>.
Hi,

Thank you for your answer.

> What are the details of that score?
> 
> If you want more detail, save your complete message for instance as test.eml,
> and run: spamassassin -x -t -D FuzzyOcr < test.eml

---------------------------------------------------------------------
[30747] info: rules: meta test DIGEST_MULTIPLE has undefined dependency 
'DCC_CHECK'
[30747] info: rules: meta test SARE_SPEC_PROLEO_M2a has dependency 
'MIME_QP_LONG_LINE' with a zero score
[30747] info: rules: meta test SARE_HEAD_SUBJ_RAND has undefined 
dependency 'SARE_XMAIL_SUSP2'
[30747] info: rules: meta test SARE_HEAD_SUBJ_RAND has undefined 
dependency 'SARE_HEAD_XAUTH_WARN'
[30747] info: rules: meta test SARE_HEAD_SUBJ_RAND has dependency 
'X_AUTH_WARN_FAKED' with a zero score
[30747] info: rules: meta test SARE_RD_SAFE has undefined dependency 
'SARE_RD_SAFE_MKSHRT'
[30747] info: rules: meta test SARE_RD_SAFE has undefined dependency 
'SARE_RD_SAFE_GT'
[30747] info: rules: meta test SARE_RD_SAFE has undefined dependency 
'SARE_RD_SAFE_TINY'
[30747] info: rules: meta test SARE_OBFU_CIALIS has undefined dependency 
'SARE_OBFU_CIALIS2'

-------------------------------------------------------------------

Content analysis details:   (3.9 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
-0.0 SPF_HELO_PASS          SPF: HELO matches SPF record
 1.4 SPF_NEUTRAL            SPF: sender does not match SPF record (neutral)
                            [SPF failed: Please see http://www.openspf.org/why.html?sender=gamito%40gmail.com&ip=193.136.173.2&receiver=mail.telbit.pt]
 5.0 FUZZY_OCR              BODY: Mail contains an image with common spam text inside
                            Words found:
                            "viagra" in 1 lines
                            "casino" in 1 lines
                            "viagra" in 1 lines
                            (3 word occurrences found)
-2.5 AWL                    AWL: From: address is in the auto white-list

-----------------------------------------------------------------------

What are those "undefined dependencies" ?

Best Regards,
Mário Gamito
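
The 3.9 total in the report above is simply the sum of the per-rule points. The Python sketch below adds up the hits and shows what the total would be without the AWL adjustment. The report text is a trimmed copy of the hit lines above; the regular expression is an assumption about the layout of those lines, not an official parser.

```python
import re

# A hit line is assumed to look like: "<points> <RULE_NAME> <description>"
HIT_RE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s+([A-Z][A-Z0-9_]*)\s")

report = """\
-0.0 SPF_HELO_PASS          SPF: HELO matches SPF record
 1.4 SPF_NEUTRAL            SPF: sender does not match SPF record (neutral)
 5.0 FUZZY_OCR              BODY: Mail contains an image with common spam text inside
                            Words found:
-2.5 AWL                    AWL: From: address is in the auto white-list
"""


def rule_scores(text):
    """Return {rule_name: points} for every hit line in the report."""
    scores = {}
    for line in text.splitlines():
        match = HIT_RE.match(line)
        if match:
            scores[match.group(2)] = float(match.group(1))
    return scores


scores = rule_scores(report)
total = round(sum(scores.values()), 1)
without_awl = round(total - scores["AWL"], 1)
print(total, without_awl)  # prints: 3.9 6.4
```

Without AWL the same hits add up to 6.4, which matches the score mentioned in the first message, although the thread does not confirm that both numbers come from the same run.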

Re: FuzzyOCR gives very low scores

Posted by René Berber <r....@computer.org>.
Mário Gamito wrote:

> I've just installed FuzzyOCR and it's really a great tool.
> Awesome.
> 
> I think it just has a glitch (maybe it's my fault, that's why I'm asking).
> It gives very low scores to the messages.
> 
> I sent this testing e-mail with this picture:
> http://www.gamito.org/teste.jpg
> 
> All the words are in FuzzyOCR.words and yes, it was marked as SPAM, but
> only with a 6.4 score.

What are the details of that score?

If you want more detail, save your complete message for instance as test.eml,
and run: spamassassin -x -t -D FuzzyOcr < test.eml

Then you can see which words were detected and how the score was added up.

Unless you changed the default FuzzyOcr configuration, I doubt the score you saw
came only from FuzzyOcr; you probably have AWL enabled, and that lowered the
score a lot.
-- 
René Berber
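
To see how AWL can lower a score "a lot", here is a small Python sketch of the adjustment. AWL moves the message score toward the sender's historical mean by a configurable factor. The sketch assumes the documented default auto_whitelist_factor of 0.5; whether this installation uses the default is not stated in the thread. The function names are illustrative only.

```python
def awl_adjustment(score, sender_mean, factor=0.5):
    """Points AWL adds (negative: subtracts) to pull score toward the mean."""
    return (sender_mean - score) * factor


def implied_sender_mean(score, adjustment, factor=0.5):
    """Invert the formula: which historical mean explains this adjustment?"""
    return score + adjustment / factor


# Assumed inputs: 6.4 points before AWL and an AWL hit of -2.5.
mean = implied_sender_mean(6.4, -2.5)
print(round(mean, 1))                       # historical mean for this sender
print(round(awl_adjustment(6.4, mean), 1))  # adjustment AWL would apply
```

Under those assumptions, a sender whose earlier mail averaged about 1.4 points gets 2.5 points taken off a 6.4-point message, which is enough to push it below a 5.0 threshold.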


Re: FuzzyOCR gives very low scores

Posted by Mário Gamito <ga...@gmail.com>.
Hi,


Thank you for your answer.

> What does a "spamassassin --lint -D fuzzyocr <samplemessage" produce?
[root@mail cur]# spamassassin --lint -D fuzzyocr < 1173546266.26462.mail.telbit.pt\,S\=82421\:2\,

[26671] info: rules: meta test DIGEST_MULTIPLE has undefined dependency 
'DCC_CHECK'
[26671] info: rules: meta test SARE_HEAD_SUBJ_RAND has undefined 
dependency 'SARE_XMAIL_SUSP2'
[26671] info: rules: meta test SARE_HEAD_SUBJ_RAND has undefined 
dependency 'SARE_HEAD_XAUTH_WARN'
[26671] info: rules: meta test SARE_HEAD_SUBJ_RAND has dependency 
'X_AUTH_WARN_FAKED' with a zero score
[26671] info: rules: meta test SARE_RD_SAFE has undefined dependency 
'SARE_RD_SAFE_MKSHRT'
[26671] info: rules: meta test SARE_RD_SAFE has undefined dependency 
'SARE_RD_SAFE_GT'
[26671] info: rules: meta test SARE_RD_SAFE has undefined dependency 
'SARE_RD_SAFE_TINY'
[26671] info: rules: meta test SARE_OBFU_CIALIS has undefined dependency 
'SARE_OBFU_CIALIS2'
[root@mail cur]#


Warm Regards,
Mário Gamito

RE: FuzzyOCR gives very low scores

Posted by Sietse van Zanen <si...@wizdom.nu>.
Well, start by reading the documentation carefully. It will give you a better understanding.

What does a "spamassassin --lint -D fuzzyocr <samplemessage" produce?

-Sietse



From: Mário Gamito
Sent: Sat 10-Mar-07 16:18
To: Sietse van Zanen
Cc: users@spamassassin.apache.org
Subject: Re: FuzzyOCR gives very low scores


Hi,

Sietse van Zanen wrote:
> FuzzyOCR does not score messages, it scores images.
>  
> If your message got a score of 6, that's probably due to the 
> auto_disable setting of FuzzyOCR.
> FuzzyOCR doesn't run when a message reaches that score. This saves 
> resources. To debug, set the auto_disable score to 100 or so.
I did.
Now it gets only 5.4 points.

I'm not sure I understand what you're telling me :(

Warm Regards,
Mário Gamito

Re: FuzzyOCR gives very low scores

Posted by Mário Gamito <ga...@gmail.com>.
Hi,

Sietse van Zanen wrote:
> FuzzyOCR does not score messages, it scores images.
>  
> If your message got a score of 6, that's probably due to the 
> auto_disable setting of FuzzyOCR.
> FuzzyOCR doesn't run when a message reaches that score. This saves 
> resources. To debug, set the auto_disable score to 100 or so.
I did.
Now it gets only 5.4 points.

I'm not sure I understand what you're telling me :(

Warm Regards,
Mário Gamito

RE: FuzzyOCR gives very low scores

Posted by Sietse van Zanen <si...@wizdom.nu>.
FuzzyOCR does not score messages, it scores images.

If your message got a score of 6, that's probably due to the auto_disable setting of FuzzyOCR.
FuzzyOCR doesn't run once a message has already reached that score. This saves resources. To debug, set the auto_disable score to 100 or so.

-Sietse
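
The auto-disable behaviour described above amounts to a threshold check made before the expensive OCR step. The Python sketch below shows the idea only. The function and parameter names are hypothetical, the thresholds are illustrative rather than FuzzyOCR defaults, and the exact comparison FuzzyOCR uses is an assumption.

```python
def should_run_ocr(score_so_far, autodisable_score):
    """Skip OCR when the other rules have already pushed the score high enough."""
    return score_so_far < autodisable_score


# A message that already has 6 points from other rules:
for threshold in (5.0, 100.0):
    print(threshold, should_run_ocr(6.0, threshold))
```

With a low threshold the image is never scanned, so no OCR points appear in the report. Raising the threshold to 100 for debugging makes sure the scan always runs.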


