Posted to users@spamassassin.apache.org by Jari Fredriksson <ja...@iki.fi> on 2009/10/17 02:12:49 UTC

KHOP_NO_FULL_NAME

I have not yet analysed what whitehats cause this, but this rule seems
suspicious to me at the moment.

On the bright side: HOSTKARMA is a pleasant thing to have, now that my
config is fixed with the community's aid.


Email: 1280  Autolearn: 765  AvgScore:  13.53  AvgScanTime: 11.23 sec
Spam:   632  Autolearn: 540  AvgScore:  34.39  AvgScanTime:  9.21 sec
Ham:    648  Autolearn: 225  AvgScore:  -6.82  AvgScanTime: 13.19 sec

Time Spent Running SA:         3.99 hours
Time Spent Processing Spam:    1.62 hours
Time Spent Processing Ham:     2.37 hours

TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK    RULE NAME                       COUNT  %OFMAIL %OFSPAM  %OFHAM
----------------------------------------------------------------------
    1    BAYES_99                          614    47.97   97.15    0.00
    2    DCC_CHECK                         601    60.86   95.09   27.47
    3    RAZOR2_CHECK                      576    45.00   91.14    0.00
    4    RCVD_IN_BRBL_LASTEXT              575    45.08   90.98    0.31
    5    RAZOR2_CF_RANGE_51_100            573    44.77   90.66    0.00
    6    HTML_MESSAGE                      570    50.39   90.19   11.57
    7    BOTNET                            566    44.22   89.56    0.00
    8    DIGEST_MULTIPLE                   559    43.67   88.45    0.00
    9    URIBL_BLACK                       551    43.05   87.18    0.00
   10    RAZOR2_CF_RANGE_E8_51_100         541    42.27   85.60    0.00
   11    RCVD_IN_HOSTKARMA_BL              539    42.11   85.28    0.00
   12    URIBL_SBL                         509    39.77   80.54    0.00
   13    URIBL_JP_SURBL                    502    39.22   79.43    0.00
   14    RCVD_IN_XBL                       491    38.36   77.69    0.00
   15    URIBL_WS_SURBL                    426    33.28   67.41    0.00
   16    RCVD_IN_BL_SPAMCOP_NET            425    33.20   67.25    0.00
   17    RCVD_IN_SEMBLACK                  418    32.66   66.14    0.00
   18    RCVD_IN_PSBL                      408    31.87   64.56    0.00
   19    KHOP_DNSBL_ADJ                    405    31.64   64.08    0.00
   20    URIBL_AB_SURBL                    374    29.22   59.18    0.00
----------------------------------------------------------------------

TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK    RULE NAME                       COUNT  %OFMAIL %OFSPAM  %OFHAM
----------------------------------------------------------------------
    1    BAYES_00                          543    42.50    0.16   83.80
    2    RCVD_IN_HOSTKARMA_W               511    40.00    0.16   78.86
    3    AWL                               498    44.06   10.44   76.85
    4    KHOP_RCVD_UNTRUST                 420    32.89    0.16   64.81
    5    KHOP_HELO_FCRDNS                  312    32.97   17.41   48.15
    6    KHOP_NO_FULL_NAME                 195    19.30    8.23   30.09
    7    RCVD_IN_HOSTKARMA_WL              182    14.30    0.16   28.09
    8    DCC_CHECK                         178    60.86   95.09   27.47
    9    RCVD_IN_DNSWL_LOW                 171    13.44    0.16   26.39
   10    RCVD_IN_DNSWL_MED                 170    13.28    0.00   26.23
   11    SPF_HELO_PASS                     160    12.50    0.00   24.69
   12    RCVD_IN_DNSWL_HI                  159    12.42    0.00   24.54
   13    DKIM_SIGNED                       114     8.98    0.16   17.59
   14    RCVD_IN_BSP_OTHER                  78     6.09    0.00   12.04
   15    HTML_MESSAGE                       75    50.39   90.19   11.57
   16    KHOP_RCVD_TRUST                    49     3.83    0.00    7.56
   17    DKIM_VERIFIED                      42     3.28    0.00    6.48
   18    KHOP_2IPS_RCVD                     32     3.44    1.90    4.94
   19    MIME_QP_LONG_LINE                  27     2.89    1.58    4.17
   20    KHOP_PGP_SIGNED                    22     1.72    0.00    3.40
----------------------------------------------------------------------
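
For anyone decoding the columns: the percentages appear to follow from the per-class hit counts and the totals in the summary above (1280 messages, 632 spam, 648 ham). Here is a minimal Python sketch, assuming %OFMAIL counts hits in both classes. The function name is illustrative and not part of any reporting script; the ham counts are taken from the matching rows of the ham table.

```python
def rule_percentages(spam_hits, ham_hits, total_spam=632, total_ham=648):
    """Return (%OFMAIL, %OFSPAM, %OFHAM) for one rule, rounded to 2 places.

    Assumption: %OFMAIL is (spam hits + ham hits) over all scanned mail.
    """
    total_mail = total_spam + total_ham

    def pct(hits, total):
        return round(100.0 * hits / total, 2)

    return (
        pct(spam_hits + ham_hits, total_mail),
        pct(spam_hits, total_spam),
        pct(ham_hits, total_ham),
    )


# DCC_CHECK: 601 spam hits, 178 ham hits
print(rule_percentages(601, 178))
# BAYES_99: 614 spam hits, no ham hits
print(rule_percentages(614, 0))
```

Under that assumption the DCC_CHECK and BAYES_99 rows come out the same as in the tables.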


--
http://www.iki.fi/jarif/

Ships are safe in harbor, but they were never meant to stay there.

Re: KHOP_NO_FULL_NAME

Posted by Adam Katz <an...@khopis.com>.
Henrik K wrote:
> Sorry but it's not worth that either.. it's not just "people" who
> send mail and even people have nicknames and whatever in their name
> fields. That should have been reasoned from the beginning.
> 
> You should really get a mass check account and not test dubious rules
> on your public channels.

Yeah, that's what I get for testing rules in such a small environment.
Within the corporate world, it makes more sense ... and even then was a
low-grade rule.  My testing mechanism could use some work, and getting a
mass check account would certainly trump it.

What steps do I need to go through to gain this access?  What
responsibilities does it entail?  Warren keeps asking me to submit
corpus data too; is there documentation on that (how cleaned it needs to
be, as my users just don't train, how privacy works, implementation
notes, etc)?

There doesn't appear to be anything related to this on the wiki.
http://wiki.apache.org/spamassassin/DevelopmentStuff

Re: KHOP_NO_FULL_NAME

Posted by Gene Heskett <ge...@verizon.net>.
On Sunday 18 October 2009, jdow wrote:
>From: "Nix" <ni...@esperi.org.uk>
>Sent: Sunday, 2009/October/18 13:24
>
>> On 18 Oct 2009, Henrik K. said:
>>> On Sat, Oct 17, 2009 at 07:22:19PM -0400, Adam Katz wrote:
>>>> Keep in mind that this rule is only worth 0.259.
>>>
>>> Sorry but it's not worth that either.. it's not just "people" who send
>>> mail
>>> and even people have nicknames and whatever in their name fields.
>>
>> Indeed we do :)
>
>As one of perhaps the earliest victims of an online stalking incident
>I expect people will forgive me for simply going by the four letters
>phonetically rendered as "Jolly Dirty Old Woman."
>
>{^_-}
>
Wouldn't have it any other way my dear (on a public list anyway), unless it 
might be the 'wizardess'.  But that also dates things, darnit.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

Most people's favorite way to end a game is by winning.

Re: KHOP_NO_FULL_NAME

Posted by jdow <jd...@earthlink.net>.
From: "Nix" <ni...@esperi.org.uk>
Sent: Sunday, 2009/October/18 13:24


> On 18 Oct 2009, Henrik K. said:
>
>> On Sat, Oct 17, 2009 at 07:22:19PM -0400, Adam Katz wrote:
>>> Keep in mind that this rule is only worth 0.259.
>>
>> Sorry but it's not worth that either.. it's not just "people" who send 
>> mail
>> and even people have nicknames and whatever in their name fields.
>
> Indeed we do :)

As one of perhaps the earliest victims of an online stalking incident
I expect people will forgive me for simply going by the four letters
phonetically rendered as "Jolly Dirty Old Woman."

{^_-} 


Re: KHOP_NO_FULL_NAME

Posted by Nix <ni...@esperi.org.uk>.
On 18 Oct 2009, Henrik K. said:

> On Sat, Oct 17, 2009 at 07:22:19PM -0400, Adam Katz wrote:
>> Keep in mind that this rule is only worth 0.259.
>
> Sorry but it's not worth that either.. it's not just "people" who send mail
> and even people have nicknames and whatever in their name fields.

Indeed we do :)

Re: KHOP_NO_FULL_NAME

Posted by Henrik K <he...@hege.li>.
On Sat, Oct 17, 2009 at 07:22:19PM -0400, Adam Katz wrote:
> Jari Fredriksson quoted himself (both on the 17th):
> >> I have not yet analysed what whitehats cause this, but this rule seems
> >> suspicious to me at the moment.
> > 
> > Now I have. Legitimate bulk mailers.
> > 
> > From: "NYTimes.com" <ny...@nytimes.com>
> > From: "Iltalehti.fi" <il...@sp.iltalehti.fi>
> > 
> > Newspapers. And others. Questionable rule.
> 
> Ah.  Interesting.  I had been suspecting either an older bug regarding
> foreign characters in correct proper names had resurfaced or you had a
> lot of correspondence with people who don't capitalize their names or
> include a last name.
> 
> I've updated the rule so that it won't fire on any mail claiming
> precedence of "bulk" or "list," which should solve that issue (and
> unfortunately fire less often on real spam too).
> 
> Keep in mind that this rule is only worth 0.259.

Sorry but it's not worth that either.. it's not just "people" who send mail
and even people have nicknames and whatever in their name fields. That
should have been reasoned from the beginning.

You should really get a mass check account and not test dubious rules on
your public channels.

All

OVERALL    SPAM%     HAM%     S/O    RANK   SCORE  NAME
      0    37048   157173    0.191   0.00    0.00  (all messages)
0.00000  19.0752  80.9248    0.191   0.00    0.00  (all messages as %)
 29.539  25.1943  30.5625    0.452   0.00    0.01  T_KHOP_NO_FULL_NAME

Without bulk|list

 28.042  25.1107  28.7327    0.466   0.00    0.01  T_KHOP_NO_FULL_NAME
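
The S/O column in these results can be reproduced from the SPAM% and HAM% columns. A minimal sketch, assuming S/O is SPAM% / (SPAM% + HAM%), which matches the figures posted here; the function name is illustrative:

```python
def spam_over_overall(spam_pct, ham_pct):
    """S/O: the share of a rule's hits that are spam.

    Assumption: inputs are the SPAM% and HAM% columns (or raw message
    counts for the "(all messages)" row); 0.5 means no discrimination.
    """
    return spam_pct / (spam_pct + ham_pct)


print(round(spam_over_overall(25.1943, 30.5625), 3))  # all mail
print(round(spam_over_overall(25.1107, 28.7327), 3))  # without bulk|list
print(round(spam_over_overall(37048, 157173), 3))     # corpus baseline
```

A rule whose S/O sits near 0.5 hits spam and ham at about the same rate, so it separates the two poorly whatever score it carries.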


Re: KHOP_NO_FULL_NAME

Posted by Jari Fredriksson <ja...@iki.fi>.

18.10.2009 2:22, Adam Katz wrote:
> Jari Fredriksson quoted himself (both on the 17th):
>>> I have not yet analysed what whitehats cause this, but this rule seems
>>> suspicious to me at the moment.
>>
>> Now I have. Legitimate bulk mailers.
>>
>> From: "NYTimes.com"<ny...@nytimes.com>
>> From: "Iltalehti.fi"<il...@sp.iltalehti.fi>
>>
>> Newspapers. And others. Questionable rule.
>
> Ah.  Interesting.  I had been suspecting either an older bug regarding
> foreign characters in correct proper names had resurfaced or you had a
> lot of correspondence with people who don't capitalize their names or
> include a last name.
>
> I've updated the rule so that it won't fire on any mail claiming
> precedence of "bulk" or "list," which should solve that issue (and
> unfortunately fire less often on real spam too).
>

Sadly, of those examples, NYTimes does not add a Precedence header to
its mail; Iltalehti does.

One with three names: The Washington Post does not trigger this rule.

There are others, like "Sun Microsystems", who luckily have two names,
but then "Microsoft" does not. Microsoft does not add a Precedence header
either.

> Keep in mind that this rule is only worth 0.259.

Ok. Not dangerous, as they do not trigger many other rules. Maybe
illegal characters in headers (usual in Finland, they put 8-bit chars
there without mime-encoding), and HTML_ONLY and DCC_CHECK.


Re: KHOP_NO_FULL_NAME

Posted by Adam Katz <an...@khopis.com>.
Jari Fredriksson quoted himself (both on the 17th):
>> I have not yet analysed what whitehats cause this, but this rule seems
>> suspicious to me at the moment.
> 
> Now I have. Legitimate bulk mailers.
> 
> From: "NYTimes.com" <ny...@nytimes.com>
> From: "Iltalehti.fi" <il...@sp.iltalehti.fi>
> 
> Newspapers. And others. Questionable rule.

Ah.  Interesting.  I had been suspecting either an older bug regarding
foreign characters in correct proper names had resurfaced or you had a
lot of correspondence with people who don't capitalize their names or
include a last name.

I've updated the rule so that it won't fire on any mail claiming
precedence of "bulk" or "list," which should solve that issue (and
unfortunately fire less often on real spam too).

Keep in mind that this rule is only worth 0.259.

Also note my post on the 14th:
> KHOP_NO_FULL_NAME might be mis-firing.  It's supposed to detect a
> properly formatted name, in the form (sans quotes):  "A K" or "Adam K"
> or "A Katz" ... maybe somebody can find a flaw in my regex or an example
> FP or FN?  Here it is, please be careful decoding the wrapping:
> 
> # This matches foreign characters by process of elimination.
> # From: must start w/ ~upper, ~letters, space/punctuation, then ~upper
> header   __FROM_FULL_NAME       From:name =~
> /^[^a-z[:punct:][:cntrl:]\d\s][^[:punct:][:cntrl:]\d\s]*[[:punct:]\s]+[^a-z[:punct:][:cntrl:]\d\s]/
> meta KHOP_NO_FULL_NAME
> !(__FROM_ENCODED_QP||__FROM_NEEDS_MIME||__FROM_FULL_NAME)
> describe KHOP_NO_FULL_NAME      Sender does not have both First and Last
> names
> score    KHOP_NO_FULL_NAME      0.259 # keep low!
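
To try the regex against the From names mentioned in this thread, here is a rough Python port. It is only a sketch: Python's `re` has no POSIX bracket classes, so `[:punct:]` and `[:cntrl:]` are approximated with ASCII sets. The `encoded_qp` and `needs_mime` arguments are hypothetical stand-ins for the `__FROM_ENCODED_QP` and `__FROM_NEEDS_MIME` subrules, whose definitions are not shown here. The input is assumed to be the display name with the quotes already stripped.

```python
import re
import string

# ASCII approximations of the POSIX classes in the posted rule (assumption).
PUNCT = re.escape(string.punctuation)  # stands in for [:punct:]
CNTRL = r"\x00-\x1f\x7f"               # stands in for [:cntrl:]

FROM_FULL_NAME = re.compile(
    rf"^[^a-z{PUNCT}{CNTRL}\d\s]"  # first char: not lowercase/punct/control/digit/space
    rf"[^{PUNCT}{CNTRL}\d\s]*"     # rest of the first word
    rf"[{PUNCT}\s]+"               # separator: punctuation and/or whitespace
    rf"[^a-z{PUNCT}{CNTRL}\d\s]"   # second word must not start lowercase
)


def no_full_name(display_name, encoded_qp=False, needs_mime=False):
    """Mimic the meta rule: fire when none of the three subrules match."""
    has_full_name = FROM_FULL_NAME.search(display_name) is not None
    return not (encoded_qp or needs_mime or has_full_name)


for name in ["NYTimes.com", "Iltalehti.fi", "Microsoft",
             "The Washington Post", "Sun Microsystems", "Adam K", "A Katz"]:
    print(f"{name!r}: fires={no_full_name(name)}")
```

With this port, "NYTimes.com" and "Iltalehti.fi" fail the match because the character after the dot is lowercase, and "Microsoft" fails because it has no separator at all. The two-word and three-word names pass, so the rule stays quiet for them.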


Re: KHOP_NO_FULL_NAME

Posted by Jari Fredriksson <ja...@iki.fi>.

17.10.2009 3:12, Jari Fredriksson wrote:
>
> I have not yet analysed what whitehats cause this, but this rule seems
> suspicious to me at the moment.
>

Now I have. Legitimate bulk mailers.

From: "NYTimes.com" <ny...@nytimes.com>
From: "Iltalehti.fi" <il...@sp.iltalehti.fi>

Newspapers. And others. Questionable rule.

--
http://www.iki.fi/jarif/

You look tired.