
Posted to users@spamassassin.apache.org by Marc Perkel <su...@junkemailfilter.com> on 2016/01/20 22:59:55 UTC

Another way to use my filter in SA

Here's another way to use my evolution filtering idea with SA.

Get rid of all the rule scores and just make a list of the rule names.
From the rule names, generate all combinations of up to 4 rule names per
fingerprint and learn those fingerprints as either ham or spam. Sort of
like this:

“A” “AB” “B” “C” “AC” “ABC” “BC” “D” “AD” “ABD” “BD” “CD” “ACD” “ABCD” 
“BCD” “E” “AE” “BE” “CE” “ACE” “BCE” “DE” “ADE” “ABDE” “BDE” “CDE” 
“ACDE” “ABCDE” “BCDE”
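
A minimal sketch of the fingerprint generation described above, in Python. The function name `fingerprints` and the `max_size` parameter are hypothetical names chosen for illustration, not anything that exists in SA. It assumes a fingerprint is an order-independent combination of rule names, so each one is stored as a sorted tuple. Under a strict reading of "up to 4", a five-name fingerprint such as "ABCDE" would not be generated.

```python
from itertools import combinations


def fingerprints(rule_names, max_size=4):
    """Return every combination of 1..max_size rule names.

    Each fingerprint is a sorted tuple, so ("A", "B") and ("B", "A")
    count as the same fingerprint (an assumption of this sketch).
    """
    names = sorted(set(rule_names))
    result = set()
    for size in range(1, min(max_size, len(names)) + 1):
        result.update(combinations(names, size))
    return result


# Five rule hits give 5 + 10 + 10 + 5 = 30 fingerprints of size 1..4.
hits = ["A", "B", "C", "D", "E"]
print(len(fingerprints(hits)))
```

The number of fingerprints grows quickly with the number of rules a message hits, which is presumably why the combination size is capped.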

Then, when a new message comes in, you make the same combinations of
fingerprints from its rule names and then use my formula:

card(Test intersect Spam diff Ham) - card(Test Intersect Ham diff Spam)

Positive result = spam
Negative result = ham
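
The formula can be sketched with Python sets. This reads it as card(Test ∩ (Spam \ Ham)) - card(Test ∩ (Ham \ Spam)); that grouping is an assumption, since the post does not parenthesize it. The names `score` and `classify` are hypothetical, and the post does not say what a result of exactly zero means, so this sketch reports it as "undecided".

```python
def score(test, spam, ham):
    """Count test fingerprints seen only in spam minus those seen only in ham."""
    spam_only = spam - ham   # fingerprints learned as spam but never as ham
    ham_only = ham - spam    # fingerprints learned as ham but never as spam
    return len(test & spam_only) - len(test & ham_only)


def classify(test, spam, ham):
    """Map the score to a verdict; the zero case is an assumption of this sketch."""
    result = score(test, spam, ham)
    if result > 0:
        return "spam"
    if result < 0:
        return "ham"
    return "undecided"


# Toy data: fingerprints written as concatenated rule names, as in the post.
spam = {"A", "AB", "B", "C"}
ham = {"C", "D", "CD"}
test = {"A", "B", "AB", "D", "AD", "BD", "ABD"}

print(score(test, spam, ham))     # 3 spam-only matches - 1 ham-only match = 2
print(classify(test, spam, ham))  # spam
```

Fingerprint "C" appears in both learned sets, so it drops out of both differences and contributes nothing either way.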

Re: Another way to use my filter in SA

Posted by Reindl Harald <h....@thelounge.net>.


On 20.01.2016 at 22:59, Marc Perkel wrote:
> Here's another way to use my evolution filtering idea with SA.
>
> Get rid of all the rule scores and just make a list of the rule names

No. The whole point of rule scores (with Bayes, or whatever other
analysis, being just another rule score) is balance and auto-correction,
so that a single mistake (mistraining, a badly composed mail) does not
produce a wrong result.

Besides that, URIBL, DNSBL/DNSWL, SPF and whatnot are another point of
balance.

>  From the rule names generate all combinations of those rule names up to
> 4 rule names in a fingerprint and learn those fingerprints as either ham
> or spam. sort of like this:
>
> “A” “AB” “B” “C” “AC” “ABC” “BC” “D” “AD” “ABD” “BD” “CD” “ACD” “ABCD”
> “BCD” “E” “AE” “BE” “CE” “ACE” “BCE” “DE” “ADE” “ABDE” “BDE” “CDE”
> “ACDE” “ABCDE” “BCDE”
>
> Then - when a new message comes in you make the same combo of
> fingerprints from the rule names and then use my formula.

As the only decision?
For sure not.

> card(Test intersect Spam diff Ham) - card(Test Intersect Ham diff Spam)
>
> Positive result = spam
> Negative result = ham
Nothing but the sum of the tests (except explicit whitelists or
blacklists) should be allowed to make the final decision.

Re: Another way to use my filter in SA

Posted by Marc Perkel <su...@junkemailfilter.com>.

On 01/20/16 14:28, John Hardin wrote:
> On Wed, 20 Jan 2016, Marc Perkel wrote:
>
>> Here's another way to use my evolution filtering idea with SA.
>>
>> Get rid of all the rule scores and just make a list of the rule 
>> names. From the rule names generate all combinations of those rule 
>> names up to 4 rule names in a fingerprint and learn those 
>> fingerprints as either ham or spam. sort of like this:
>>
>> “A” “AB” “B” “C” “AC” “ABC” “BC” “D” “AD” “ABD” “BD” “CD” “ACD” 
>> “ABCD” “BCD” “E” “AE” “BE” “CE” “ACE” “BCE” “DE” “ADE” “ABDE” “BDE” 
>> “CDE” “ACDE” “ABCDE” “BCDE”
>>
>> Then - when a new message comes in you make the same combo of 
>> fingerprints from the rule names and then use my formula.
>>
>> card(Test intersect Spam diff Ham) - card(Test Intersect Ham diff Spam)
>>
>> Positive result = spam
>> Negative result = ham
>
> Unfortunately this also requires training. It would render SA a 
> product that does not work out-of-the-box.
>

Actually, it could include a corpus pretrained on the rules, at least to
get people started. Someone (like me?) could also provide it as a
service that SA would talk to. SA would send the tokens to the service
and the service would return a score.

-- 
Marc Perkel - Sales/Support
support@junkemailfilter.com
http://www.junkemailfilter.com
Junk Email Filter dot com
415-992-3400

Re: Another way to use my filter in SA

Posted by John Hardin <jh...@impsec.org>.

On Wed, 20 Jan 2016, Marc Perkel wrote:

> Here's another way to use my evolution filtering idea with SA.
>
> Get rid of all the rule scores and just make a list of the rule names. From 
> the rule names generate all combinations of those rule names up to 4 rule 
> names in a fingerprint and learn those fingerprints as either ham or spam. 
> sort of like this:
>
> “A” “AB” “B” “C” “AC” “ABC” “BC” “D” “AD” “ABD” “BD” “CD” “ACD” “ABCD” “BCD” 
> “E” “AE” “BE” “CE” “ACE” “BCE” “DE” “ADE” “ABDE” “BDE” “CDE” “ACDE” “ABCDE” 
> “BCDE”
>
> Then - when a new message comes in you make the same combo of fingerprints 
> from the rule names and then use my formula.
>
> card(Test intersect Spam diff Ham) - card(Test Intersect Ham diff Spam)
>
> Positive result = spam
> Negative result = ham

Unfortunately this also requires training. It would render SA a product 
that does not work out-of-the-box.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   The problem is when people look at Yahoo, slashdot, or groklaw and
   jump from obvious and correct observations like "Oh my God, this
   place is teeming with utter morons" to incorrect conclusions like
   "there's nothing of value here".        -- Al Petrofsky, in Y! SCOX
-----------------------------------------------------------------------
  3 days until John Moses Browning's 161st Birthday