Posted to users@spamassassin.apache.org by Dan <ka...@gmail.com> on 2006/01/30 17:48:17 UTC
Detecting phishing urls
What is the best way to detect urls that have one address in the <a>
tag's href, but another in its body like the following example:
<a href="http://123.123.123.123/fraud_uri">http://amazon.com/official_looking_path</a>
I can write a regexp that looks for an address in the <a> tag's body
that is different than in its href, but I figured someone else had
already written one. Can someone point me in the right direction?
-Dan
Re: Detecting phishing urls
Posted by Loren Wilton <lw...@earthlink.net>.
> I can write a regexp that looks for an address in the <a> tag's body
> that is different than in its href, but I figured someone else had
> already written one. Can someone point me in the right direction?
I believe there are one or two in sare_fraud, but they aren't scored very
high because they can FP a lot on legit stuff.
Loren
Re: Post your top 10 from sa-stats
Posted by Dallas Engelken <da...@uribl.com>.
On Mon, 2006-01-30 at 16:45 -0600, qqqq wrote:
> Here is mine:
>
> TOP SPAM RULES FIRED
> ------------------------------------------------------------
> RANK RULE NAME COUNT %OFRULES %OFMAIL %OFSPAM %OFHAM
> ------------------------------------------------------------
> 1 URIBL_BLACK 257778 7.36 44.54 77.31
amen to that!
--
Dallas Engelken <da...@uribl.com>
http://uribl.com
Re: Post your top 10 from sa-stats
Posted by Patrick Sneyers <cg...@bulckens.com>.
Email: 1556 Autolearn: 679 AvgScore: 3.66 AvgScanTime: 4.27 sec
Spam: 480 Autolearn: 148 AvgScore: 14.65 AvgScanTime: 3.71 sec
Ham: 1076 Autolearn: 531 AvgScore: -1.24 AvgScanTime: 4.52 sec
Time Spent Running SA: 1.84 hours
Time Spent Processing Spam: 0.50 hours
Time Spent Processing Ham: 1.35 hours
TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
----------------------------------------------------------------------
1 RAZOR2_CHECK 398 27.76 82.92 3.16
2 RAZOR2_CF_RANGE_51_100 388 27.06 80.83 3.07
3 BAYES_99 381 24.74 79.38 0.37
4 HTML_MESSAGE 330 41.26 68.75 29.00
5 RAZOR2_CF_RANGE_E8_51_100 310 21.08 64.58 1.67
6 RAZOR2_CF_RANGE_E4_51_100 222 15.30 46.25 1.49
7 MIME_HTML_ONLY 170 15.17 35.42 6.13
8 MSGID_FROM_MTA_ID 152 11.38 31.67 2.32
9 FORGED_HOTMAIL_RCVD2 96 6.17 20.00 0.00
10 HTML_MIME_NO_HTML_TAG 95 7.01 19.79 1.30
11 DRUGS_ERECTILE 91 5.85 18.96 0.00
12 MIME_HEADER_CTYPE_ONLY 91 5.91 18.96 0.09
13 HTML_30_40 86 6.56 17.92 1.49
14 REPTO_OVERQUOTE_THEBAT 82 5.33 17.08 0.09
15 INFO_TLD 72 4.88 15.00 0.37
16 UPPERCASE_25_50 66 4.69 13.75 0.65
17 SARE_ADULT2 65 4.18 13.54 0.00
18 DRUG_ED_CAPS 60 3.86 12.50 0.00
19 UNPARSEABLE_RELAY 58 16.90 12.08 19.05
20 SARE_SUPERVIAGRA 55 3.53 11.46 0.00
----------------------------------------------------------------------
TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
----------------------------------------------------------------------
1 BAYES_00 1023 66.00 0.83 95.07
2 AWL 736 47.88 1.88 68.40
3 HTML_MESSAGE 312 41.26 68.75 29.00
4 NO_REAL_NAME 247 17.61 5.62 22.96
5 UNPARSEABLE_RELAY 205 16.90 12.08 19.05
6 NO_RELAYS 116 7.46 0.00 10.78
7 TW_IJ 112 7.20 0.00 10.41
8 ADDRESS_IN_SUBJECT 105 7.39 2.08 9.76
9 FORGED_RCVD_HELO 86 6.75 3.96 7.99
10 MIME_HTML_ONLY 66 15.17 35.42 6.13
11 HTML_90_100 52 5.46 6.88 4.83
12 TW_JK 41 2.63 0.00 3.81
13 RAZOR2_CHECK 34 27.76 82.92 3.16
14 RAZOR2_CF_RANGE_51_100 33 27.06 80.83 3.07
15 EXTRA_MPART_TYPE 27 4.24 8.12 2.51
16 TW_JG 27 1.74 0.00 2.51
17 TW_JF 25 1.61 0.00 2.32
18 MSGID_FROM_MTA_ID 25 11.38 31.67 2.32
19 HTML_50_60 24 4.69 10.21 2.23
20 BAYES_50 24 4.82 10.62 2.23
----------------------------------------------------------------------
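For anyone comparing these tables: the percentage columns look like plain ratios against the message totals at the top of the report. The short Python check below is only a sketch using the numbers posted above (the helper name pct is made up for illustration, and the rounding to two decimals is an assumption about how sa-stats prints). It reproduces the RAZOR2_CHECK row, taking 398 spam hits from the spam table and 34 ham hits from the ham table.

```python
# Totals from the summary lines of the report above.
total_mail, total_spam, total_ham = 1556, 480, 1076

# RAZOR2_CHECK: 398 hits in the spam table, 34 hits in the ham table.
spam_hits, ham_hits = 398, 34


def pct(part, whole):
    """Percentage rounded to two decimals (assumed to match the report)."""
    return round(100.0 * part / whole, 2)


of_spam = pct(spam_hits, total_spam)             # share of spam that hit the rule
of_ham = pct(ham_hits, total_ham)                # share of ham that hit the rule
of_mail = pct(spam_hits + ham_hits, total_mail)  # share of all mail

print(of_mail, of_spam, of_ham)  # 27.76 82.92 3.16
```

The same three values appear in both tables for a given rule, which is why RAZOR2_CHECK shows 27.76 / 82.92 / 3.16 whether it is listed under spam or ham; only the COUNT column differs.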
Re: Post your top 10 from sa-stats
Posted by Mike Jackson <mj...@barking-dog.net>.
I use the other sa-stats script, which I modified to show stats on the
rules:
Top spam rules: Ham: Spam: % Ham: % Spam:
----------------------------------------------------------------------
RAZOR2_CHECK 90 1098 4.32 68.33
RAZOR2_CF_RANGE_51_100 94 1085 4.52 67.52
BAYES_99 0 961 0.00 59.80
RAZOR2_CF_RANGE_E8_51_100 70 861 3.36 53.58
HTML_MESSAGE 365 760 17.54 47.29
URIBL_JP_SURBL 59 757 2.84 47.11
RAZOR2_CF_RANGE_E4_51_100 65 622 3.12 38.71
URIBL_OB_SURBL 43 616 2.07 38.33
PYZOR_CHECK 75 611 3.60 38.02
URIBL_WS_SURBL 45 592 2.16 36.84
Top ham rules: Ham: Spam: % Ham: % Spam:
----------------------------------------------------------------------
BAYES_00 1222 13 58.72 0.81
AWL 1063 366 51.08 22.78
ALL_TRUSTED 763 14 36.67 0.87
NO_REAL_NAME 733 284 35.22 17.67
HTML_MESSAGE 365 760 17.54 47.29
SPF_PASS 265 187 12.73 11.64
USER_IN_WHITELIST 172 0 8.27 0.00
FORGED_RCVD_HELO 151 158 7.26 9.83
SUBJ_HAS_UNIQ_ID 143 11 6.87 0.68
USER_IN_DEF_SPF_WL 140 0 6.73 0.00
Re: Post your top 10 from sa-stats
Posted by Matt Kettler <mk...@evi-inc.com>.
jdow wrote:
> From: "Dallas Engelken" <da...@uribl.com>
>
>> On Tue, 2006-01-31 at 07:37 -0600, DAve wrote:
>>> And mine, note that these are *post* MailScanner and RBLs, which are
>>> running on my mail gateways. By the time SA gets the mail I've pruned
>>> anywhere from 45% to 75% of the messages, depending on the day.
>>>
>>> TOP SPAM RULES FIRED
>>> RANK RULE NAME COUNT %OFRULES %OFMAIL %OFSPAM %OFHAM
>>> 1 URIBL_BLACK 162360 8.88 55.25 88.86 2.10
>>
>> is that 2% ham hits really missed spam or are you having false positives
>> due to URIBL_BLACK??
>
> I am inclined to think there are a very few false positives, one every
> couple thousand or so. Spam that manages to sail through, here, does not
> seem to get marked with any BL rules as a general rule. That is why it
> scores 3.0 rather than a higher number. {^_-}
I personally see an FP rate higher than 1 in 500 from URIBL_BLACK.
# grep URIBL_BLACK maillog |wc -l
3992
# grep URIBL_BLACK maillog |grep BSP_TRUSTED |wc -l
9
Most of those come from hits against emails sent by ediets.com's subscriber
services. While this site is heavily ad laden, it is a subscriber service.
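The arithmetic behind that figure, as a rough sketch: the two grep pipelines count log lines that mention URIBL_BLACK, and of those the ones that also mention BSP_TRUSTED. Treating the BSP_TRUSTED hits as false positives is the assumption made in the message above, not something the code can verify. The helper count_hits and the three sample lines are hypothetical and only show the counting.

```python
def count_hits(lines, *needles):
    """Count lines containing every needle, like chained grep ... | wc -l."""
    return sum(1 for line in lines if all(n in line for n in needles))


# Hypothetical log lines, only to show the counting.
sample = [
    "result: Y 12 - URIBL_BLACK,BAYES_99",
    "result: . 1 - URIBL_BLACK,BSP_TRUSTED",
    "result: . 0 - BAYES_00",
]
print(count_hits(sample, "URIBL_BLACK"))                 # 2
print(count_hits(sample, "URIBL_BLACK", "BSP_TRUSTED"))  # 1

# Counts reported in the message above.
total_hits, trusted_hits = 3992, 9
rate = trusted_hits / total_hits
print(f"{rate:.3%}, about 1 in {round(1 / rate)}")  # 0.225%, about 1 in 444
```

So 9 out of 3992 works out to roughly 1 in 444, which is indeed a little worse than 1 in 500.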
Re: Post your top 10 from sa-stats
Posted by jdow <jd...@earthlink.net>.
From: "Dallas Engelken" <da...@uribl.com>
> On Tue, 2006-01-31 at 07:37 -0600, DAve wrote:
>> And mine, note that these are *post* MailScanner and RBLs, which are
>> running on my mail gateways. By the time SA gets the mail I've pruned
>> anywhere from 45% to 75% of the messages, depending on the day.
>>
>> TOP SPAM RULES FIRED
>> RANK RULE NAME COUNT %OFRULES %OFMAIL %OFSPAM %OFHAM
>> 1 URIBL_BLACK 162360 8.88 55.25 88.86 2.10
>
> is that 2% ham hits really missed spam or are you having false positives
> due to URIBL_BLACK??
I am inclined to think there are a very few false positives, one every
couple thousand or so. Spam that manages to sail through, here, does not
seem to get marked with any BL rules as a general rule. That is why it
scores 3.0 rather than a higher number. {^_-}
{^_^}
Re: Post your top 10 from sa-stats
Posted by Dallas Engelken <da...@uribl.com>.
On Tue, 2006-01-31 at 07:37 -0600, DAve wrote:
> And mine, note that these are *post* MailScanner and RBLs, which are
> running on my mail gateways. By the time SA gets the mail I've pruned
> anywhere from 45% to 75% of the messages, depending on the day.
>
> TOP SPAM RULES FIRED
> RANK RULE NAME COUNT %OFRULES %OFMAIL %OFSPAM %OFHAM
> 1 URIBL_BLACK 162360 8.88 55.25 88.86 2.10
Are those 2% ham hits really missed spam, or are you having false positives
due to URIBL_BLACK?
Thanks,
--
Dallas Engelken <da...@uribl.com>
http://uribl.com
Re: Post your top 10 from sa-stats
Posted by jdow <jd...@earthlink.net>.
From: "Andy Jezierski" <aj...@stepan.com>
> Here's mine after making it through the RBL lists & Greylisting:
Congratulations on the staggeringly poor Bayes training you have. I've
not seen it reported worse short of utter failure.
{O.O}
Re: Post your top 10 from sa-stats
Posted by Gene Heskett <ge...@verizon.net>.
On Friday 03 February 2006 00:30, jdow wrote:
>From: "John Fleming" <jo...@wa9als.com>
>
>>>>>Wrong tool. Visit http://www.rulesemporium.com/ and find the
>>>>>sa-stats.pl on their site. It is the one most of us are using. It
>>>>>gives individual score breakdowns. The name coincidence is
>>>>>regrettable.
>>
>> I have the "other sa-stats.pl" working well on my system. But I'm
>> apparently not pointing the "other" version from RE to the log file
>> correctly, as the results are all zero.
>>
>> Major perl inexperience here - Would someone pleez send me their
>> config lines for the RE version?
>
>Generally an sa-stats.pl will tell you what parameters are. The
> defaults for the sa-stats.pl that come with SpamAssassin Tools are
> useless. I went in and edited them. I made "end" be "today" and
> "start" be "yesterday." It's maybe not ideal. But it functions as
> well as that puppy ever functions.
>
>The SARE version works fine as long as it can figure out where the
>mail log lives.
>
And if you can find it on SARE, it was invisible when I looked last
night.
>{^_^}
--
Cheers, Gene
People having trouble with vz bouncing email to me should add the word
'online' between the 'verizon', and the dot which bypasses vz's
stupid bounce rules. I do use spamassassin too. :-)
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2006 by Maurice Eugene Heskett, all rights reserved.
Re: Post your top 10 from sa-stats
Posted by Chris Purves <ch...@northfolk.ca>.
On Friday 03 February 2006 21:58, John Fleming wrote:
> >
> > Using the latest file from rules emporium, I made the file executable,
> > then:
> >
> > ./sa-stats-1.0.txt -l /var/log/spamassassin/ -f spamd.log
> >
> > For help:
> > ./sa-stats-1.0.txt -h
>
> Thanks for your response! I am running 3.0.3 on Debian Sarge (stable).
> The logs I have to use are /var/log/mail.log or /var/log/syslog. Using the
> "other" sa-stats.pl (that works fine), I use the log mail.log.
>
> However, when I run sa-stats.txt, everything is empty. It must not be
> getting the right log?? THANKS! -John
>
> # perl ./sa-stats-1.0.txt -l /var/log/syslog.log
>
Try:
# perl ./sa-stats-1.0.txt -l /var/log/ -f syslog.log
I found that you need to specify both the directory and the log file
separately. But then you can read in several files at once.
--
Good day, eh.
Chris
Re: Post your top 10 from sa-stats
Posted by John Fleming <jo...@wa9als.com>.
----- Original Message -----
From: "Chris Purves" <ch...@northfolk.ca>
To: <us...@spamassassin.apache.org>
Sent: Thursday, February 02, 2006 9:00 PM
Subject: Re: Post your top 10 from sa-stats
> John Fleming wrote:
>>>>> Wrong tool. Visit http://www.rulesemporium.com/ and find the
>>>>> sa-stats.pl on their site. It is the one most of us are using. It
>>>>> gives individual score breakdowns. The name coincidence is
>>>>> regrettable.
>>
>>
>> I have the "other sa-stats.pl" working well on my system. But I'm
>> apparently not pointing the "other" version from RE to the log file
>> correctly, as the results are all zero.
>>
>> Major perl inexperience here - Would someone pleez send me their config
>> lines for the RE version?
>>
> Using the latest file from rules emporium, I made the file executable,
> then:
>
> ./sa-stats-1.0.txt -l /var/log/spamassassin/ -f spamd.log
>
> For help:
> ./sa-stats-1.0.txt -h
Thanks for your response! I am running 3.0.3 on Debian Sarge (stable). The
logs I have to use are /var/log/mail.log or /var/log/syslog. Using the
"other" sa-stats.pl (that works fine), I use the log mail.log. Both
mail.log and syslog have entries like:
Jan 31 06:52:09 Luke spamd[8429]: identified spam (129.9/5.0) for john:1000 in 11.3 seconds, 3146 bytes.
Jan 31 06:52:09 Luke spamd[8429]: result: Y 129 - BAYES_99,DNS_FROM_AHBL_RHSBL,DRUGS_ERECTILE,DRUG_DOSAGE,DRUG_ED_CAPS,HTML_FONT_BIG,HTML_FONT_SIZE_LARGE,HTML_MESSAGE,HTML_SHOUTING5,INVALID_DATE,RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,RCVD_IN_BL_SPAMCOP_NET,SARE_SUPERVIAGRA,UPPERCASE_25_50,URIBL_AB_SURBL,URIBL_BLACK,URIBL_SBL,URIBL_SC_SURBL,URIBL_WS_SURBL,USER_IN_BLACKLIST,WLS_URI_OPT_1406 scantime=11.3,size=3146,mid=<76...@00inkjets.com>,bayes=1,autolearn=spam
Jan 31 06:52:09 Luke postfix/local[10453]: F3FF433E703: to=<jo...@wa9als.com>, relay=local, delay=12, status=sent (delivered to command: /usr/bin/procmail)
Jan 31 06:52:09 Luke postfix/qmgr[8796]: F3FF433E703: removed
However, when I run sa-stats.txt, everything is empty. It must not be
getting the right log?? THANKS! -John
# perl ./sa-stats-1.0.txt -l /var/log/syslog.log
Email: 0 Autolearn: 0 AvgScore: 0.00 AvgScanTime: 0.00 sec
Spam: 0 Autolearn: 0 AvgScore: 0.00 AvgScanTime: 0.00 sec
Ham: 0 Autolearn: 0 AvgScore: 0.00 AvgScanTime: 0.00 sec
Time Spent Running SA: 0.00 hours
Time Spent Processing Spam: 0.00 hours
Time Spent Processing Ham: 0.00 hours
TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
----------------------------------------------------------------------
----------------------------------------------------------------------
TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM
----------------------------------------------------------------------
----------------------------------------------------------------------
#
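To check whether a given log actually contains lines a stats script could use, a small stand-alone tally can help. The Python below is only a sketch and is not how sa-stats itself works. It assumes each spamd "result:" entry sits on one physical line in the log (mail clients wrap them when quoting), with the layout seen in the sample above: a Y flag for spam, a score, a dash, a comma-separated rule list, then scantime=. Treating any flag other than Y as ham is an assumption, and the second sample line is hypothetical. The first sample line is the one quoted above with the rule list shortened.

```python
import re
from collections import Counter

# Assumed layout, based on the spamd line quoted in this thread:
#   ... spamd[PID]: result: FLAG SCORE - RULE1,RULE2,... scantime=...
RESULT_RE = re.compile(
    r"spamd\[\d+\]: result: (\S) +(-?\d+) - (\S*?)\s*scantime="
)


def tally(lines):
    """Return (spam_rule_counts, ham_rule_counts) from spamd result lines."""
    spam, ham = Counter(), Counter()
    for line in lines:
        m = RESULT_RE.search(line)
        if not m:
            continue  # not a spamd result line (postfix entries, etc.)
        flag, _score, rules = m.groups()
        target = spam if flag == "Y" else ham  # assumption: non-Y means ham
        target.update(r for r in rules.split(",") if r)
    return spam, ham


sample = [
    # Shortened from the log excerpt above.
    "Jan 31 06:52:09 Luke spamd[8429]: result: Y 129 - "
    "BAYES_99,DRUGS_ERECTILE,URIBL_BLACK scantime=11.3,size=3146,"
    "bayes=1,autolearn=spam",
    # Hypothetical ham line, for illustration only.
    "Jan 31 06:53:10 Luke spamd[8429]: result: . -1 - "
    "BAYES_00 scantime=2.0,size=900,bayes=0,autolearn=ham",
    "Jan 31 06:52:09 Luke postfix/qmgr[8796]: F3FF433E703: removed",
]
spam, ham = tally(sample)
print(spam.most_common(3))
print(ham.most_common(3))
```

If a tally like this finds rule names in a log but a stats script still reports all zeros, the script is probably not reading that file at all, which points at the command-line options rather than at the log contents.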
Re: Post your top 10 from sa-stats
Posted by Chris Purves <ch...@northfolk.ca>.
John Fleming wrote:
>>>> Wrong tool. Visit http://www.rulesemporium.com/ and find the
>>>> sa-stats.pl on their site. It is the one most of us are using. It
>>>> gives individual score breakdowns. The name coincidence is
>>>> regrettable.
>
>
> I have the "other sa-stats.pl" working well on my system. But I'm
> apparently not pointing the "other" version from RE to the log file
> correctly, as the results are all zero.
>
> Major perl inexperience here - Would someone pleez send me their config
> lines for the RE version?
>
Using the latest file from rules emporium, I made the file executable,
then:
./sa-stats-1.0.txt -l /var/log/spamassassin/ -f spamd.log
For help:
./sa-stats-1.0.txt -h
--
Good day, eh.
Chris
Re: Post your top 10 from sa-stats
Posted by jdow <jd...@earthlink.net>.
From: "John Fleming" <jo...@wa9als.com>
>>>>Wrong tool. Visit http://www.rulesemporium.com/ and find the
>>>>sa-stats.pl on their site. It is the one most of us are using. It
>>>>gives individual score breakdowns. The name coincidence is
>>>>regrettable.
>
> I have the "other sa-stats.pl" working well on my system. But I'm
> apparently not pointing the "other" version from RE to the log file
> correctly, as the results are all zero.
>
> Major perl inexperience here - Would someone pleez send me their config
> lines for the RE version?
Generally an sa-stats.pl will tell you what parameters are. The defaults
for the sa-stats.pl that come with SpamAssassin Tools are useless. I
went in and edited them. I made "end" be "today" and "start" be "yesterday."
It's maybe not ideal. But it functions as well as that puppy ever functions.
The SARE version works fine as long as it can figure out where the
mail log lives.
{^_^}
Re: Post your top 10 from sa-stats
Posted by John Fleming <jo...@wa9als.com>.
>>>Wrong tool. Visit http://www.rulesemporium.com/ and find the
>>>sa-stats.pl on their site. It is the one most of us are using. It
>>>gives individual score breakdowns. The name coincidence is
>>>regrettable.
I have the "other sa-stats.pl" working well on my system. But I'm
apparently not pointing the "other" version from RE to the log file
correctly, as the results are all zero.
Major perl inexperience here - Would someone pleez send me their config
lines for the RE version?
Thanks - John
Re: Post your top 10 from sa-stats
Posted by Chris Purves <ch...@northfolk.ca>.
Gene Heskett wrote:
> On Thursday 02 February 2006 00:36, jdow wrote:
>
>>Wrong tool. Visit http://www.rulesemporium.com/ and find the
>>sa-stats.pl on their site. It is the one most of us are using. It
>>gives individual score breakdowns. The name coincidence is
>>regrettable.
>
From an earlier posting by Dallas Engelken
SA 3.0.x - http://www.rulesemporium.com/programs/sa-stats.txt
SA 3.1.x - http://www.rulesemporium.com/programs/sa-stats-1.0.txt
--
Good day, eh.
Chris
Re: Post your top 10 from sa-stats
Posted by Gene Heskett <ge...@verizon.net>.
On Thursday 02 February 2006 00:36, jdow wrote:
>Wrong tool. Visit http://www.rulesemporium.com/ and find the
> sa-stats.pl on their site. It is the one most of us are using. It
> gives individual score breakdowns. The name coincidence is
> regrettable.
Unforch Joanne, I was not able to find a link that led to that script
on that site. I'll check later as I bookmarked it, but it looked as if
maybe it wasn't all 'up' when I checked.
Thanks.
>{^_^}
>----- Original Message -----
>From: "Gene Heskett" <ge...@verizon.net>
>
>> Greetings;
>>
>> One of this thread's messages prompted me to locate this script and
>> run it, which I found in the
>>
>> /usr/src/redhat/BUILD/Mail-SpamAssassin-3.1.0/tools/sa-stats.pl
>>
>> as if it hadn't been installed. Maybe it hasn't? Unforch, it would
>> appear that stats are not being kept as all categories report 0.
>>
>> What option do I need to set, and where, in order to enable this
>> 'record keeping'?
--
Cheers, Gene
Re: Post your top 10 from sa-stats
Posted by jdow <jd...@earthlink.net>.
Wrong tool. Visit http://www.rulesemporium.com/ and find the sa-stats.pl
on their site. It is the one most of us are using. It gives individual
score breakdowns. The name coincidence is regrettable.
{^_^}
----- Original Message -----
From: "Gene Heskett" <ge...@verizon.net>
> Greetings;
>
> One of this thread's messages prompted me to locate this script and run
> it, which I found in the
>
> /usr/src/redhat/BUILD/Mail-SpamAssassin-3.1.0/tools/sa-stats.pl
>
> as if it hadn't been installed. Maybe it hasn't? Unforch, it would
> appear that stats are not being kept as all categories report 0.
>
> What option do I need to set, and where, in order to enable this 'record
> keeping'?
Re: Post your top 10 from sa-stats
Posted by Gene Heskett <ge...@verizon.net>.
Greetings;
One of this thread's messages prompted me to locate this script and run
it, which I found in the
/usr/src/redhat/BUILD/Mail-SpamAssassin-3.1.0/tools/sa-stats.pl
as if it hadn't been installed. Maybe it hasn't? Unforch, it would
appear that stats are not being kept as all categories report 0.
What option do I need to set, and where, in order to enable this 'record
keeping'?
Thanks.
--
Cheers, Gene
Re: Post your top 10 from sa-stats
Posted by Andy Jezierski <aj...@stepan.com>.
Jeff Chan <je...@surbl.org> wrote on 02/01/2006 08:53:22 PM:
[snip]
> I'd recommend adding a rule for jp.surbl.org if you don't already
> have one. It's generally our best performing list currently. A
> sample rule is mentioned under "jp - jwSpamSpy + Prolocation data
> source" on our Quick Start page:
>
> http://www.surbl.org/
>
> Are you perhaps using a pre-3.1 version of SpamAssassin?
No, I'm running 3.1. Must have been a bad day for a sample. The JP rule
was #13 that day. Re-ran the stats for the previous week instead of a
single day:
5 URIBL_OB_SURBL 2110 3.30 5.35 30.20 0.02
6 URIBL_WS_SURBL 1787 2.80 4.53 25.58 0.10
7 URIBL_JP_SURBL 1752 2.74 4.44 25.08 0.01
25 URIBL_SC_SURBL 344 0.54 0.87 4.92 0.00
94 URIBL_AB_SURBL 94 0.15 0.24 1.35 0.00
Spam: 6987 Ham: 32462 Total: 39449
Maybe the greylisting is filtering out a lot of the spam that would
normally hit the JP list. I know it easily blocks at least 90% of spam
from even getting to SA in the first place. Before greylisting SA would
process 60-70 thousand spams per week, now it's usually less than 7000.
Andy
Re: Post your top 10 from sa-stats
Posted by Jeff Chan <je...@surbl.org>.
On Wednesday, February 1, 2006, 8:43:30 AM, Andy Jezierski wrote:
> Here's mine after making it through the RBL lists & Greylisting:
> TOP SPAM RULES FIRED
> --------------------------------------------------------------------------------
> RANK RULE NAME COUNT %OFRULES %OFMAIL %OFSPAM %OFHAM
> --------------------------------------------------------------------------------
> 1 HTML_MESSAGE 887 6.42 12.60 72.53 63.65
> 2 BAYES_99 807 5.84 11.46 65.99 0.22
> 3 DCC_CHECK 718 5.20 10.20 58.71 15.37
> 4 URIBL_BLACK 600 4.34 8.52 49.06 1.63
> 5 LG_4C_2V_3C 370 2.68 5.25 30.25 15.26
> 6 DIGEST_MULTIPLE 368 2.66 5.23 30.09 1.12
> 7 RAZOR2_CHECK 367 2.66 5.21 30.01 0.91
> 8 URIBL_OB_SURBL 320 2.32 4.54 26.17 0.00
> 9 RAZOR2_CF_RANGE_51_100 311 2.25 4.42 25.43 0.17
> 10 URIBL_WS_SURBL 289 2.09 4.10 23.63 0.03
I'd recommend adding a rule for jp.surbl.org if you don't already
have one. It's generally our best performing list currently. A
sample rule is mentioned under "jp - jwSpamSpy + Prolocation data
source" on our Quick Start page:
http://www.surbl.org/
Are you perhaps using a pre-3.1 version of SpamAssassin?
Jeff C.
--
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/
Re: Post your top 10 from sa-stats
Posted by Andy Jezierski <aj...@stepan.com>.
Here's mine after making it through the RBL lists & Greylisting:
TOP SPAM RULES FIRED
--------------------------------------------------------------------------------
RANK RULE NAME COUNT %OFRULES %OFMAIL %OFSPAM %OFHAM
--------------------------------------------------------------------------------
1 HTML_MESSAGE 887 6.42 12.60 72.53 63.65
2 BAYES_99 807 5.84 11.46 65.99 0.22
3 DCC_CHECK 718 5.20 10.20 58.71 15.37
4 URIBL_BLACK 600 4.34 8.52 49.06 1.63
5 LG_4C_2V_3C 370 2.68 5.25 30.25 15.26
6 DIGEST_MULTIPLE 368 2.66 5.23 30.09 1.12
7 RAZOR2_CHECK 367 2.66 5.21 30.01 0.91
8 URIBL_OB_SURBL 320 2.32 4.54 26.17 0.00
9 RAZOR2_CF_RANGE_51_100 311 2.25 4.42 25.43 0.17
10 URIBL_WS_SURBL 289 2.09 4.10 23.63 0.03
--------------------------------------------------------------------------------
TOP HAM RULES FIRED
--------------------------------------------------------------------------------
RANK RULE NAME COUNT %OFRULES %OFMAIL %OFSPAM %OFHAM
--------------------------------------------------------------------------------
1 HTML_MESSAGE 3703 14.95 52.59 72.53 63.65
2 BAYES_00 2852 11.52 40.51 3.11 49.02
3 BAYES_50 2092 8.45 29.71 17.09 35.96
4 NO_REAL_NAME 1035 4.18 14.70 7.85 17.79
5 DCC_CHECK 894 3.61 12.70 58.71 15.37
6 LG_4C_2V_3C 888 3.59 12.61 30.25 15.26
7 MIME_HTML_ONLY 615 2.48 8.73 21.83 10.57
8 SPF_HELO_PASS 447 1.81 6.35 15.37 7.68
9 DK_POLICY_SIGNSOME 429 1.73 6.09 4.91 7.37
10 DK_SIGNED 410 1.66 5.82 5.15 7.05
Post your top 10 from sa-stats
Posted by qqqq <qq...@usermail.com>.
Here is mine:
TOP SPAM RULES FIRED
------------------------------------------------------------
RANK RULE NAME COUNT %OFRULES %OFMAIL %OFSPAM %OFHAM
------------------------------------------------------------
1 URIBL_BLACK 257778 7.36 44.54 77.31 2.96
2 URIBL_JP_SURBL 193668 5.53 33.46 58.08 0.04
3 URIBL_SBL 178382 5.09 30.82 53.50 3.68
4 HTML_MESSAGE 177061 5.05 30.59 53.10 59.87
5 URIBL_WS_SURBL 162665 4.64 28.10 48.79 0.19
6 URIBL_OB_SURBL 144744 4.13 25.01 43.41 0.18
7 URIBL_SC_SURBL 140354 4.01 24.25 42.09 0.00
8 RCVD_IN_SORBS_DUL 99918 2.85 17.26 29.97 5.54
9 URIBL_AB_SURBL 87634 2.50 15.14 26.28 0.00
10 UNPARSEABLE_RELAY 67142 1.92 11.60 20.14 5.47
QQQQ
Re: Detecting phishing urls
Posted by Theo Van Dinter <fe...@apache.org>.
On Mon, Jan 30, 2006 at 05:10:42PM -0500, Dan wrote:
> I was thinking of a regexp along the lines of:
> /href=\"https?:\/\/[0-9]{1,3}(\.[0-9]{1,3}){3}[^>]+>http:\/\/\w/i
>
> It's not perfect, but it would detect the above scenario.
Yes and no. You'd have to do a rawbody, and it doesn't take into account
newlines and what-not. It also is a lot more work.
> What does check_https_ip_mismatch() do?
The short version is that it looks for an href to an IP when the anchor
text is https to a non-IP, aka:
<a href="http://123.123.123.123">https://www.ebay.com/</a>
Other variations of that showed pretty horrible results. This type of
thing was discussed before on the list, and in at least BZ ticket 4255:
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4255
--
Randomly Generated Tagline:
When the blind lead the blind they will both fall over the cliff.
-- Chinese proverb
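For illustration, the check described above (href pointing at a bare IP while the anchor text shows an https URL on a hostname) can be sketched with a real HTML parser instead of a regexp. This is Python, not SpamAssassin's Perl, and it is not the actual check_https_ip_mismatch() code; the class and function names below are made up, and only IPv4 dotted quads are treated as IPs.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlsplit

IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def host_of(url):
    """Hostname of a URL, or '' if it cannot be parsed."""
    try:
        return urlsplit(url.strip()).hostname or ""
    except ValueError:
        return ""


def is_ip_href(url):
    """True if the URL's host is a dotted-quad IPv4 address."""
    return bool(IPV4_RE.match(host_of(url)))


def is_https_name(text):
    """True if the text is an https URL whose host is not an IPv4 address."""
    host = host_of(text)
    is_https = text.strip().lower().startswith("https://")
    return is_https and bool(host) and not IPV4_RE.match(host)


class MismatchFinder(HTMLParser):
    """Collect (href, anchor_text) pairs that match the described pattern."""

    def __init__(self):
        super().__init__()
        self.href = None   # href of the <a> currently open, if any
        self.text = []     # text fragments seen inside that <a>
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.text = []

    def handle_data(self, data):
        if self.href is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            anchor = "".join(self.text).strip()
            if is_ip_href(self.href) and is_https_name(anchor):
                self.hits.append((self.href, anchor))
            self.href = None


def find_mismatches(html):
    finder = MismatchFinder()
    finder.feed(html)
    finder.close()
    return finder.hits


print(find_mismatches('<a href="http://123.123.123.123">https://www.ebay.com/</a>'))
```

Because the parser handles line breaks and extra attributes inside the tag, the newline problem mentioned above does not arise here. Note that an anchor text starting with plain http:// is deliberately ignored, in line with the narrow check described in this message.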
Re: Detecting phishing urls
Posted by Dan <ka...@gmail.com>.
On 1/30/06, Theo Van Dinter <fe...@apache.org> wrote:
> On Mon, Jan 30, 2006 at 11:48:17AM -0500, Dan wrote:
> > <a href="http://123.123.123.123/fraud_uri">http://amazon.com/official_looking_path</a>
> > I can write a regexp that looks for an address in the <a> tag's body
> > that is different than in its href, but I figured someone else had
> > already written one. Can someone point me in the right direction?
>
> You can't do this in a regexp, you need to write some code. There's already
> the check_https_ip_mismatch() function which looks for something similar to
> this. It turns out that href != anchor text is a pretty bad spam sign since
> it happens in ham all the time.
I was thinking of a regexp along the lines of:
/href=\"https?:\/\/[0-9]{1,3}(\.[0-9]{1,3}){3}[^>]+>http:\/\/\w/i
It's not perfect, but it would detect the above scenario.
What does check_https_ip_mismatch() do?
-Dan
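A quick way to see what that regexp does and does not catch is to try it outside SpamAssassin. The snippet below is a small Python sketch: the pattern is the one from this message with the Perl delimiters and escaped slashes removed, and the two test strings are the example from the start of the thread, once as-is and once with a line break after the opening tag. The variable names are invented for the example.

```python
import re

# Same pattern as above, written as a Python raw string; /i becomes IGNORECASE.
DAN_RE = re.compile(
    r'href="https?://[0-9]{1,3}(\.[0-9]{1,3}){3}[^>]+>http://\w',
    re.IGNORECASE,
)

hit = ('<a href="http://123.123.123.123/fraud_uri">'
       'http://amazon.com/official_looking_path</a>')
wrapped = ('<a href="http://123.123.123.123/fraud_uri">\n'
           'http://amazon.com/official_looking_path</a>')

print(bool(DAN_RE.search(hit)))      # True
print(bool(DAN_RE.search(wrapped)))  # False
```

The first string matches, the second does not, because the pattern expects the visible URL to start immediately after the closing angle bracket of the tag. An anchor text beginning with https:// would be missed for the same kind of reason, since the pattern only looks for http:// there.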
Re: Detecting phishing urls
Posted by Theo Van Dinter <fe...@apache.org>.
On Mon, Jan 30, 2006 at 11:48:17AM -0500, Dan wrote:
> <a href="http://123.123.123.123/fraud_uri">http://amazon.com/official_looking_path</a>
> I can write a regexp that looks for an address in the <a> tag's body
> that is different than in its href, but I figured someone else had
> already written one. Can someone point me in the right direction?
You can't do this in a regexp, you need to write some code. There's already
the check_https_ip_mismatch() function which looks for something similar to
this. It turns out that href != anchor text is a pretty bad spam sign since
it happens in ham all the time.
--
Randomly Generated Tagline:
"90% of this game is half mental." - Yogi Berra