Posted to users@spamassassin.apache.org by "Sharma, Ashish" <as...@hp.com> on 2010/06/21 21:25:54 UTC
unable to find logic behind spamassassin rule
Hi,
I have the latest version of SpamAssassin, and I am unable to find the logic behind the following rule and its high spam score.
MANY_SPAN_IN_TEXT 3.099
Can anybody give a reason?
Thanks in advance
Ashish Sharma
Re: unable to find logic behind spamassassin rule
Posted by John Hardin <jh...@impsec.org>.
On Mon, 21 Jun 2010, Bowie Bailey wrote:
> Michael Scheidell wrote:
>> On 6/21/10 3:25 PM, Sharma, Ashish wrote:
>>>
>>> I have the latest version of spamassassin, I am unable to find the
>>> logic behind the following rule and it's high spam score.
>>>
>>> MANY_SPAN_IN_TEXT 3.099
>>>
>>> Can anybody give a reason?
>>
>> 72_scores.cf:score MANY_SPAN_IN_TEXT 1.862 2.398 1.862 2.398
>
> 72_active.cf:rawbody __SPAN_BEG_TEXT /[a-z]{2}<(?i:span)\s/
> 72_active.cf:tflags __SPAN_BEG_TEXT multiple
> 72_active.cf:rawbody __SPAN_END_TEXT /[^;>]<\/(?i:span)>[a-z]{3}/
> 72_active.cf:tflags __SPAN_END_TEXT multiple
>
> In other words, the message has more than 4 <span> tags and more than 4
> </span> tags.
It's slightly more than that. There aren't just <span> tags, there are
<span> tags embedded within lowercase text. It appears to be a way to try
to break pattern matching on spammy words, by dropping a <span></span> tag
pair in the middle:
via<span>sausage</span>gra
this renders visually as a single word commonly seen in pharma spam, but a
naive string matching spam filter may be spoofed and miss it.
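
To illustrate the point, here is a minimal Python sketch (not SpamAssassin code) of why a plain substring check can miss a word that has been split by an empty tag pair. The sample text, the naive_match helper and the crude tag-stripping regex are all assumptions made for illustration only:

```python
import re

# Hypothetical sample: a word split by an empty <span></span> pair.
raw_html = "cheap phar<span></span>macy deals"

def naive_match(text: str, word: str) -> bool:
    """Plain substring search, as a naive filter might do."""
    return word in text.lower()

def strip_tags(text: str) -> str:
    """Crude stand-in for rendering: drop anything that looks like a tag."""
    return re.sub(r"<[^>]*>", "", text)

print(naive_match(raw_html, "pharmacy"))              # False
print(naive_match(strip_tags(raw_html), "pharmacy"))  # True
```

The raw markup never contains the word as one string, but the text a reader sees does.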
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
Re: unable to find logic behind spamassassin rule
Posted by Bowie Bailey <Bo...@BUC.com>.
Michael Scheidell wrote:
> On 6/21/10 3:25 PM, Sharma, Ashish wrote:
>> Hi,
>>
>> I have the latest version of spamassassin, I am unable to find the
>> logic behind the following rule and it's high spam score.
>>
>> MANY_SPAN_IN_TEXT 3.099
>>
>>
>> Can anybody give a reason?
>>
>>
> grep MANY_SPAN_IN_TEXT *
> 72_active.cf:##{ MANY_SPAN_IN_TEXT
> 72_active.cf:meta MANY_SPAN_IN_TEXT (__SPAN_BEG_TEXT > 4) && (__SPAN_END_TEXT > 4)
> 72_active.cf:describe MANY_SPAN_IN_TEXT Many <SPAN> tags embedded within text
> 72_active.cf:##} MANY_SPAN_IN_TEXT
> 72_scores.cf:score MANY_SPAN_IN_TEXT 1.862 2.398 1.862 2.398
72_active.cf:rawbody __SPAN_BEG_TEXT /[a-z]{2}<(?i:span)\s/
72_active.cf:tflags __SPAN_BEG_TEXT multiple
72_active.cf:rawbody __SPAN_END_TEXT /[^;>]<\/(?i:span)>[a-z]{3}/
72_active.cf:tflags __SPAN_END_TEXT multiple
In other words, the message has more than 4 <span> tags and more than 4
</span> tags. The scores are generated automatically based on the fact
that this pattern matches much more often on spam messages than on ham
messages. If it is causing problems for you, you can override the score
in your local.cf file like this:
score MANY_SPAN_IN_TEXT 1.0
Use whatever score you want. A score of 0 will disable the rule.
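
For anyone who wants to experiment with the patterns outside SpamAssassin, the following Python sketch approximates the counting logic. It reuses the two regular expressions from the 72_active.cf lines above and the "more than 4" condition from the meta rule. The function name and the sample body are hypothetical, and the sketch ignores how SpamAssassin actually decodes and splits the raw body, so treat it as a rough approximation rather than the real implementation:

```python
import re

# Patterns copied from the 72_active.cf lines quoted above
# (Perl's \/ is written as a plain / in Python).
SPAN_BEG_TEXT = re.compile(r"[a-z]{2}<(?i:span)\s")
SPAN_END_TEXT = re.compile(r"[^;>]</(?i:span)>[a-z]{3}")

def many_span_in_text(raw_body: str) -> bool:
    """Approximate the meta rule: both sub-rules must hit more than 4 times."""
    beg_hits = len(SPAN_BEG_TEXT.findall(raw_body))
    end_hits = len(SPAN_END_TEXT.findall(raw_body))
    return beg_hits > 4 and end_hits > 4

# Hypothetical body: one obfuscated word, repeated.
chunk = 'med<span class="x">i</span>cine '
print(many_span_in_text(chunk * 4))  # False
print(many_span_in_text(chunk * 5))  # True
```

Note that the opening pattern requires whitespace after "<span", so a bare <span> with no attributes only counts toward the closing pattern.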
--
Bowie
Re: unable to find logic behind spamassassin rule
Posted by Michael Scheidell <sc...@secnap.net>.
On 6/21/10 3:25 PM, Sharma, Ashish wrote:
> Hi,
>
> I have the latest version of spamassassin, I am unable to find the logic behind the following rule and it's high spam score.
>
> MANY_SPAN_IN_TEXT 3.099
>
>
> Can anybody give a reason?
>
>
grep MANY_SPAN_IN_TEXT *
72_active.cf:##{ MANY_SPAN_IN_TEXT
72_active.cf:meta MANY_SPAN_IN_TEXT (__SPAN_BEG_TEXT > 4) && (__SPAN_END_TEXT > 4)
72_active.cf:describe MANY_SPAN_IN_TEXT Many <SPAN> tags embedded within text
72_active.cf:##} MANY_SPAN_IN_TEXT
72_scores.cf:score MANY_SPAN_IN_TEXT 1.862 2.398 1.862 2.398
> Thanks in advance
>
> Ashish Sharma
>
--
Michael Scheidell, CTO
Phone: 561-999-5000, x 1259
SECNAP Network Security Corporation
* Certified SNORT Integrator
* 2008-9 Hot Company Award Winner, World Executive Alliance
* Five-Star Partner Program 2009, VARBusiness
* Best Anti-Spam Product 2008, Network Products Guide
* King of Spam Filters, SC Magazine 2008
______________________________________________________________________
This email has been scanned and certified safe by SpammerTrap(r).
For Information please see http://www.secnap.com/products/spammertrap/
______________________________________________________________________
Re: unable to find logic behind spamassassin rule
Posted by Michael Scheidell <li...@secnap.com>.
On 6/21/10 3:25 PM, Sharma, Ashish wrote:
> Hi,
>
> I have the latest version of spamassassin, I am unable to find the logic behind the following rule and it's high spam score.
>
> MANY_SPAN_IN_TEXT 3.099
>
>
>
As for the scoring, it is done automatically, by checking how much spam has more than 4 <span>jlkjlkj</span> tags versus how much ham does.
The current scoring is:
score MANY_SPAN_IN_TEXT 1.862 2.398 1.862 2.398
Which of the four values applies depends on whether network tests, learning (Bayes), etc. are enabled.
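
As a rough illustration of how one of the four values gets picked, here is a small Python sketch. It assumes the usual SpamAssassin score-set order (no Bayes and no network tests, network tests only, Bayes only, both). The pick_score function is hypothetical and not part of SpamAssassin:

```python
# The four values from the score line quoted above.
SCORES = (1.862, 2.398, 1.862, 2.398)

def pick_score(scores, bayes_enabled: bool, net_enabled: bool) -> float:
    """Select a score by set index: Bayes adds 2, network tests add 1."""
    index = (2 if bayes_enabled else 0) + (1 if net_enabled else 0)
    return scores[index]

print(pick_score(SCORES, bayes_enabled=False, net_enabled=True))  # 2.398
print(pick_score(SCORES, bayes_enabled=True, net_enabled=False))  # 1.862
```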
> Can anybody give a reason?
>
> Thanks in advance
>
> Ashish Sharma
>