Posted to users@spamassassin.apache.org by Nigel Kendrick <su...@petdoctors.co.uk> on 2007/01/03 11:55:59 UTC

RE: FuzzyOCR matches word "" - FIXED!


-----Original Message-----
From: Nigel Kendrick [mailto:support-lists@petdoctors.co.uk] 
Sent: Wednesday, January 03, 2007 9:11 AM
To: users@spamassassin.apache.org
Subject: FuzzyOCR matches word ""

Hi,

I have just upgraded from FuzzyOCR 2.3b to the 3.4.2 devel by copying over
the .cf and .pm files, re-making my tweaks to the .cf file and
compiling/installing gifsicle. Following a restart of spamassassin,
everything is kinda working, but the debug log shows that FuzzyOCR is
finding matches for "":

[SNIP]



OK, so I added a couple of debug lines to check the parsing of the words
list, saved the original .pm file and put mine in place and everything
checked out OK. I put back the original .pm file and everything is still
working OK.

Not planning to take this much further - just happy it's working - but will
keep an eye on it.

Hmmm




Re: "Dear Homeowner" spam

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
John Andersen wrote:
> On Thursday 11 January 2007 07:37, Daryl C. W. O'Shea wrote:
>>> One more reason to permanently blacklist geocities in SURBL IMHO.
>> Small deployments could get away with it, but if you're a large ISP
>> you'd never hear the end of the complaints about it.  My WebRedirect
>> plugin takes care of geocities spam nicely though.
> 
> How does that differ from what Surbl does?

The difference is that it's not a uri blacklist.


> (I've never seen your web-redirect plug-in.  
> Do you have a webpage describing it?)

http://wiki.apache.org/spamassassin/WebRedirectPlugin


Daryl

Re: "Dear Homeowner" spam

Posted by John Andersen <js...@pen.homeip.net>.
On Thursday 11 January 2007 07:37, Daryl C. W. O'Shea wrote:
> > One more reason to permanently blacklist geocities in SURBL IMHO.
>
> Small deployments could get away with it, but if you're a large ISP
> you'd never hear the end of the complaints about it.  My WebRedirect
> plugin takes care of geocities spam nicely though.

How does that differ from what Surbl does?
(I've never seen your web-redirect plug-in.  
Do you have a webpage describing it?)

-- 
_____________________________________
John Andersen

Re: "Dear Homeowner" spam

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
Daryl C. W. O'Shea wrote:

> Can someone forward me a copy of the spam in question as an attachment?

Never mind, I just found one.  16.4 on this particular one.  My bayes 
rules are scored a little higher than default and I've got a few 
additional rules though:


*  0.0 DK_POLICY_SIGNSOME Domain Keys: policy says domain signs some
*  1.0 WEB_301 Contains a web link that returns 301
*      [URIs: http://geocities.yahoo.com.br/dalyfy59122/]
*  1.4 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail)
*      [SPF failed:
*  0.5 GEOCITIES URI: A uri contained the token 'geocities'
*  4.1 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
*      [score: 1.0000]
*  2.6 RCVD_IN_DSBL RBL: Received via a relay in list.dsbl.org
*      [<http://dsbl.org/listing?218.238.201.49>]
*  0.7 RCVD_IN_NJABL_PROXY RBL: NJABL: sender is an open proxy
*      [218.238.201.49 listed in combined.njabl.org]
*  3.9 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
*      [218.238.201.49 listed in zen.spamhaus.org]
*  0.7 SARE_SPEC_XXGEOCITIE5 spamsign pointing to free webhost spam site
*  0.0 SIQ_OI_IP_UNKNOWN Query returned IP reputation unknown
*      [SIQ: score: -1 queried: bolt.com/218.238.201.49]
*  0.0 SIQ_OI_DOM_UNKNOWN Query returned domain reputation unknown
*      [SIQ: score: -1 queried: bolt.com/218.238.201.49]
*  0.0 SIQ_OI_REL_UNKNOWN Query returned relative reputation unknown
*      [SIQ: score: -1 queried: bolt.com/218.238.201.49]
*  1.5 SIQ_OI_00 Outbound Index Reputation: http://outboundindex.org/
*      [SIQ: score: 0 queried: bolt.com/218.238.201.49]

Re: "Dear Homeowner" spam

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
John Andersen wrote:
> On Tuesday 09 January 2007 06:47, Jack Gostl wrote:
>> Now that you mention it, yes, it had a Geocities URL.
>>
>> ----- Original Message -----
>> From: "John Andersen" <js...@pen.homeip.net>
>> To: <us...@spamassassin.apache.org>
>> Sent: Saturday, January 06, 2007 10:09 PM
>> Subject: Re: "Dear Homeowner" spam
> 
> One more reason to permanently blacklist geocities in SURBL IMHO.

Small deployments could get away with it, but if you're a large ISP 
you'd never hear the end of the complaints about it.  My WebRedirect 
plugin takes care of geocities spam nicely though.


> An even better reason for the SpamAssassin team to find out how
> this spammer manages to consistently evade all filters.
> These spams have been slipping through for so long I'm starting
> to suspect an inside job.

After asking a number of times, I'm still not sure if I get this spam. 
The closest one I can figure is the mortgage spam I get that used to 
have "dear homeowner" in it, but now usually just has a subject of "Your 
application has been accepted" or something similar and a body paragraph 
that starts with "Bad credit OK,".

Can someone forward me a copy of the spam in question as an attachment?


Daryl

Re: "Dear Homeowner" spam

Posted by John Andersen <js...@pen.homeip.net>.
On Tuesday 09 January 2007 06:47, Jack Gostl wrote:
> Now that you mention it, yes, it had a Geocities URL.
>
> ----- Original Message -----
> From: "John Andersen" <js...@pen.homeip.net>
> To: <us...@spamassassin.apache.org>
> Sent: Saturday, January 06, 2007 10:09 PM
> Subject: Re: "Dear Homeowner" spam

One more reason to permanently blacklist geocities in SURBL IMHO.

An even better reason for the SpamAssassin team to find out how
this spammer manages to consistently evade all filters.
These spams have been slipping through for so long I'm starting
to suspect an inside job.

-- 
_____________________________________
John Andersen

Re: "Dear Homeowner" spam

Posted by Jack Gostl <go...@argoscomp.com>.
Now that you mention it, yes, it had a Geocities URL.

----- Original Message ----- 
From: "John Andersen" <js...@pen.homeip.net>
To: <us...@spamassassin.apache.org>
Sent: Saturday, January 06, 2007 10:09 PM
Subject: Re: "Dear Homeowner" spam




Re: "Dear Homeowner" spam

Posted by John Andersen <js...@pen.homeip.net>.
On Wednesday 03 January 2007 03:59, Sietse van Zanen wrote:
> Can you post (a link to) an example message?
>
> I am pretty sure they are caught in my setup.
>

I wonder if these are the "your credit rating doesn't matter to us"
messages.  That person is on this list and studies things very
carefully.  Those messages have been sneaking through for 
a very long time.

They almost always contain a geocities URL.  

-- 
_____________________________________
John Andersen

RE: "Dear Homeowner" spam

Posted by Sietse van Zanen <si...@wizdom.nu>.
Can you post (a link to) an example message?

I am pretty sure they are caught in my setup.

-Sietse



From: Jack Gostl
Sent: Wed 03-Jan-07 13:26
To: users@spamassassin.apache.org
Subject: "Dear Homeowner" spam


I've been getting a bunch of spam hawking mortgage rates. You may have seen 
it; it starts with "Dear Homeowner."   The only test that flags this 
message is "BAYES_50", for all practical purposes a score of 0.

What concerns me the most is that this triggers "autolearn=ham".  I later 
feed this back through sa-learn as spam, but what I'm wondering is whether 
this undoes the damage to the Bayes databases caused by the autolearn=ham.

I'm considering lowering the autolearn threshold to less than zero. I 
wonder if anyone else has any thoughts on this as well.

Thanks

Jack

Re: "Dear Homeowner" spam

Posted by Karl Auer <ka...@biplane.com.au>.
On Wed, 2007-01-03 at 07:26 -0500, Jack Gostl wrote:
> I'm considering lowering the autolearn threshold to less than zero. I 
> wonder if anyone else has any thoughts on this as well.

I set the autolearn for ham to -10, so it has to be very hammy to get
learned. Seems to work well.

SA should allow autolearn to be separately turned on for ham and spam;
using a very low rating is kludgy. I'd rather be able to say explicitly
"autolearn spam, don't autolearn ham".
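
For anyone wanting to try the same thing, a minimal sketch of how this
might look in local.cf, assuming the stock AutoLearnThreshold plugin is
loaded. The -10 is the value mentioned above; the spam threshold shown is
the shipped default and is included only for completeness, not as a
recommendation:

```
# local.cf (sketch only; adjust to your own setup)

# Keep Bayes autolearning switched on
bayes_auto_learn 1

# Autolearn as ham only when the message scores below -10
bayes_auto_learn_threshold_nonspam -10.0

# Autolearn as spam only when the message scores above 12.0 (default)
bayes_auto_learn_threshold_spam 12.0
```

Running "spamassassin --lint" after editing should catch any typos in the
option names before the daemon is restarted.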

Regards, K.

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Karl Auer (kauer@biplane.com.au)                   +61-2-64957160 (h)
http://www.biplane.com.au/~kauer/                  +61-428-957160 (mob)


"Dear Homeowner" spam

Posted by Jack Gostl <go...@argoscomp.com>.
I've been getting a bunch of spam hawking mortgage rates. You may have seen 
it; it starts with "Dear Homeowner."   The only test that flags this 
message is "BAYES_50", for all practical purposes a score of 0.

What concerns me the most is that this triggers "autolearn=ham".  I later 
feed this back through sa-learn as spam, but what I'm wondering is whether 
this undoes the damage to the Bayes databases caused by the autolearn=ham.

I'm considering lowering the autolearn threshold to less than zero. I 
wonder if anyone else has any thoughts on this as well.

Thanks

Jack





RE: FuzzyOCR matches word "" -BROKE AGAIN - But reproducible

Posted by Nigel Kendrick <su...@petdoctors.co.uk>.
 
-----Original Message-----
From: Nigel Kendrick [mailto:support-lists@petdoctors.co.uk]
Sent: Wednesday, January 03, 2007 9:11 AM
To: users@spamassassin.apache.org
Subject: FuzzyOCR matches word ""

Hi,

I have just upgraded from FuzzyOCR 2.3b to the 3.4.2 devel by copying over
the .cf and .pm files, re-making my tweaks to the .cf file and
compiling/installing gifsicle. Following a restart of spamassassin,
everything is kinda working, but the debug log shows that FuzzyOCR is
finding matches for "":

[SNIP]



OK, so I added a couple of debug lines to check the parsing of the words
list, saved the original .pm file and put mine in place and everything
checked out OK. I put back the original .pm file and everything is still
working OK.

Not planning to take this much further - just happy it's working - but will
keep an eye on it.

Hmmm





Broke again!

Only happens when I turn on the hash database by setting
"focr_enable_image_hashing 2"

The db files are present and world-writable (for testing).

Double Hmmm