Posted to users@spamassassin.apache.org by Turbo Fredriksson <tu...@debian.org> on 2006/10/18 23:18:18 UTC

[lessons come] Bug#63460: defining gender context.

This kind of spam has been getting through for quite some time
now, but now it's really starting to bug me!

I'm running SpamAssassin (spamd+spamc) version 3.1.3.

This is the mail that is in my mailbox, so it's already
processed. Running it again with spamc (still) gives me
0.3 points (same as in the mail):

----- s n i p -----
Content analysis details:   (0.3 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.1 EXTRA_MPART_TYPE       Header has extraneous Content-type:...type= entry
 0.0 UNPARSEABLE_RELAY      Informational: message has unparseable relay lines
 1.8 HTML_IMAGE_ONLY_24     BODY: HTML: images with 2000-2400 bytes of words
 0.0 HTML_MESSAGE           BODY: HTML included in message
-2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
                            [score: 0.0000]
----- s n i p -----


The first obvious question is WHY this isn't caught, and the
second is WHAT I can do about it... Any ideas?


Re: [lessons come] Bug#63460: defining gender context.

Posted by Andy Jezierski <aj...@stepan.com>.
Jeroen Tebbens <je...@tebbens.net> wrote on 10/18/2006 04:27:54 PM:

> Theo Van Dinter wrote:
> > On Wed, Oct 18, 2006 at 11:18:18PM +0200, Turbo Fredriksson wrote:
> > 
> >> This kind of spam has been getting through for quite some time
> >> now, but now it's really starting to bug me!
> >> 
> >
> > 
> >> -2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
> >>                             [score: 0.0000]
> >> 
> >
> > Learn the mail as spam, -2.6 isn't helping.
> > Second, you don't have the rules from sa-update so do that. :)
> >
> > That message scores a 7.9 for me w/ no BAYES rules, FWIW.
> >
> > 
> 
> With imageinfo and FuzzyOcr (using 2.3rc1 atm) you can get it to 33 even. :)
> 
> Content analysis details:   (33.0 points, 7.0 required)
> 
>  pts rule name              description
> ---- ---------------------- --------------------------------------------------
>  0.8 MY_DSL                 I could use a BL for this.
>  1.1 EXTRA_MPART_TYPE       Header has extraneous Content-type:...type= entry
>  0.0 UNPARSEABLE_RELAY      Informational: message has unparseable relay lines
>  2.8 TVD_FW_GRAPHIC_ID1     BODY: TVD_FW_GRAPHIC_ID1
>  1.8 HTML_IMAGE_ONLY_24     BODY: HTML: images with 2000-2400 bytes of words
>  1.0 HTML_MESSAGE           BODY: HTML included in message
>  0.0 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
>                             [score: 0.5000]
>  1.0 INLINE_IMAGE           RAW: Inline Images
>  2.5 SARE_GIF_ATTACH        FULL: Email has a inline gif
>  0.5 VOWEL_FROM_5           Impronouncable from header (6 consecutive vowels)
>  0.9 FM_NO_STYLE            FM_NO_STYLE
>  2.5 SARE_GIF_STOX          Inline Gif with little HTML
>   18 FUZZY_OCR              BODY: Mail contains an image with common spam text inside
>                             Words found:
>                             "alert" in 1 lines
>                             "news" in 2 lines
>                             "alert" in 1 lines
>                             "stock" in 2 lines
>                             "investor" in 1 lines
>                             "company" in 1 lines
>                             "price" in 1 lines
>                             "trade" in 1 lines
>                             "banking" in 1 lines
>                             "litl" in 2 lines
>                             "meridia" in 1 lines
>                             "penis" in 1 lines
>                             "kunde" in 1 lines
>                             (16 word occurrences found)
> 


Hmmm... I think I may need to update my FuzzyOcr. I didn't get any
FuzzyOcr hits at all, but still managed an 8.6.

Andy

Re: [lessons come] Bug#63460: defining gender context.

Posted by Jeroen Tebbens <je...@tebbens.net>.
Theo Van Dinter wrote:
> On Wed, Oct 18, 2006 at 11:18:18PM +0200, Turbo Fredriksson wrote:
>   
>> This kind of spam has been getting through for quite some time
>> now, but now it's really starting to bug me!
>>     
>
>   
>> -2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
>>                             [score: 0.0000]
>>     
>
> Learn the mail as spam, -2.6 isn't helping.
> Second, you don't have the rules from sa-update so do that. :)
>
> That message scores a 7.9 for me w/ no BAYES rules, FWIW.
>
>   

With imageinfo and FuzzyOcr (using 2.3rc1 atm) you can get it to 33 even. :)

Content analysis details:   (33.0 points, 7.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.8 MY_DSL                 I could use a BL for this.
 1.1 EXTRA_MPART_TYPE       Header has extraneous Content-type:...type= entry
 0.0 UNPARSEABLE_RELAY      Informational: message has unparseable relay lines
 2.8 TVD_FW_GRAPHIC_ID1     BODY: TVD_FW_GRAPHIC_ID1
 1.8 HTML_IMAGE_ONLY_24     BODY: HTML: images with 2000-2400 bytes of words
 1.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
                            [score: 0.5000]
 1.0 INLINE_IMAGE           RAW: Inline Images
 2.5 SARE_GIF_ATTACH        FULL: Email has a inline gif
 0.5 VOWEL_FROM_5           Impronouncable from header (6 consecutive vowels)
 0.9 FM_NO_STYLE            FM_NO_STYLE
 2.5 SARE_GIF_STOX          Inline Gif with little HTML
  18 FUZZY_OCR              BODY: Mail contains an image with common spam text inside
                            Words found:
                            "alert" in 1 lines
                            "news" in 2 lines
                            "alert" in 1 lines
                            "stock" in 2 lines
                            "investor" in 1 lines
                            "company" in 1 lines
                            "price" in 1 lines
                            "trade" in 1 lines
                            "banking" in 1 lines
                            "litl" in 2 lines
                            "meridia" in 1 lines
                            "penis" in 1 lines
                            "kunde" in 1 lines
                            (16 word occurrences found)


Re: [lessons come] Bug#63460: defining gender context.

Posted by Theo Van Dinter <fe...@apache.org>.
On Wed, Oct 18, 2006 at 11:18:18PM +0200, Turbo Fredriksson wrote:
> This kind of spam has been getting through for quite some time
> now, but now it's really starting to bug me!

> -2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
>                             [score: 0.0000]

Learn the mail as spam, -2.6 isn't helping.
Second, you don't have the rules from sa-update so do that. :)

That message scores a 7.9 for me w/ no BAYES rules, FWIW.
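
To see how much BAYES_00 is costing, the total in the original report can be
recomputed with and without the BAYES rule. This is only a rough Python
sketch, assuming the report layout shown in the original post; REPORT,
parse_rules and total are made-up names for illustration and are not part
of SpamAssassin.

```python
import re

# Rule lines copied from the report in the original post.
REPORT = """\
 1.1 EXTRA_MPART_TYPE       Header has extraneous Content-type:...type= entry
 0.0 UNPARSEABLE_RELAY      Informational: message has unparseable relay lines
 1.8 HTML_IMAGE_ONLY_24     BODY: HTML: images with 2000-2400 bytes of words
 0.0 HTML_MESSAGE           BODY: HTML included in message
-2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
                            [score: 0.0000]
"""

# A rule line is: optional sign, a number, whitespace, an upper-case rule name.
# Continuation lines such as "[score: 0.0000]" do not match and are skipped.
RULE_LINE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s+([A-Z][A-Z0-9_]*)\b")


def parse_rules(report):
    """Return a list of (rule_name, score) pairs found in a report body."""
    rules = []
    for line in report.splitlines():
        match = RULE_LINE.match(line)
        if match:
            rules.append((match.group(2), float(match.group(1))))
    return rules


def total(rules, skip_prefix=None):
    """Sum the scores, optionally leaving out rules starting with skip_prefix."""
    kept = [score for name, score in rules
            if skip_prefix is None or not name.startswith(skip_prefix)]
    return round(sum(kept), 1)


rules = parse_rules(REPORT)
print(total(rules))                        # 0.3, as in the report
print(total(rules, skip_prefix="BAYES_"))  # 2.9 without BAYES_00
```

With only the rules that fired in the original report, dropping BAYES_00
moves the total from 0.3 to 2.9, which is still under the 5.0 required. That
fits the advice above: retraining removes the negative score, but newer
rules are what would have to supply the rest.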

-- 
Randomly Selected Tagline:
I used to spell badlie, but now I got worser.