Posted to users@spamassassin.apache.org by Marc Perkel <ma...@perkel.com> on 2009/10/09 17:14:44 UTC

SA needs a new paradigm for rule structure

I've brought this idea up over the years but I'll try to explain it in a 
different way. Maybe we can do this with a lot of meta rules.

What we need are rules that combine a lot of simple rules into concepts 
and then combine those rules into rules that score - and score big. As 
an example, let's take a standard Nigerian scam email.

 From <> reply to:

[I don't know you] Dear stranger, I am Mr., Ms., Mrs., my name is

[I am connected] I am a soldier in Iraq, I am the daughter of an 
African president, I work at a bank in Hong Kong

[I have money] I have the sum of 56 million dollars USD

[the money is hot] no beneficiaries, sneak it out of the country, 
oppressive regime

[transfer to your account] splitting the funds, wire to your account

[I need your information] name, address, account number

[I want you to contact me] by email, phone

[keep this a secret] confidential discretion

So - we create a lot of simple rules with no points, matching key words 
and phrases, and then combine these rules using meta rules to get these 
concepts. That way we have meta rules like "they don't know me", "they 
are talking about transferring millions", "they want my information", 
"they are talking about hot money". Then you combine those concepts into 
rules that can definitively determine it is spam.

And - I am still looking for someone who might do Bayesian or some other 
automatic system that looks for rule combinations and increases scores 
based on that.
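
A rough sketch of the layering described above (phrases feed concepts, and only a combination of concepts scores), written in Python rather than SpamAssassin rule syntax so it can be run standalone. The phrase patterns, the concept names, the threshold of 3, and the score of 100 are all illustrative assumptions, not taken from any real rule set.

```python
import re

# Layer 1: non-scoring phrase patterns, grouped by the concept they support.
# Every phrase and concept name here is a hypothetical example.
CONCEPTS = {
    "stranger": [r"\bdear (stranger|friend)\b", r"\bmy name is\b"],
    "has_money": [r"\b\d+ million (dollars|usd)\b", r"\bthe sum of\b"],
    "hot_money": [r"\bno beneficiar(y|ies)\b", r"\bout of the country\b"],
    "wants_info": [r"\baccount number\b", r"\byour (full )?name\b"],
}


def concepts_hit(text):
    """Layer 2: a concept fires when any one of its phrases matches."""
    lowered = text.lower()
    return {
        name
        for name, patterns in CONCEPTS.items()
        if any(re.search(p, lowered) for p in patterns)
    }


def advance_fee_score(text, needed=3, points=100.0):
    """Layer 3: only a combination of concepts carries a score."""
    return points if len(concepts_hit(text)) >= needed else 0.0
```

A message that only asks for an account number hits a single concept and scores nothing; one that also greets a stranger, names a sum, and mentions missing beneficiaries crosses the threshold.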


Re: SA needs a new paradigm for rule structure

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Fri, 2009-10-09 at 11:34 -0700, John Hardin wrote:
> On Fri, 9 Oct 2009, Karsten Bräckelmann wrote:

> > Whoa, dude! You just left the heavy sarcasm in, and snipped everything
> > from the quote that clarifies this statement and identifies it as
> > sarcasm.
> 
> I suspect that Alex was responding to Marc rather than to you, and he 
> agrees with your sarcasm... The "add" suggests this.

Hmm -- you might actually be right. :)  Maybe I misunderstood the intent
and jumped to conclusions based on reading it once.

(But then again, too many folks don't even read something fully once,
before they form an opinion, chime in, or start acting... :/ )


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: SA needs a new paradigm for rule structure

Posted by John Hardin <jh...@impsec.org>.
On Fri, 9 Oct 2009, Karsten Bräckelmann wrote:

> On Fri, 2009-10-09 at 13:28 -0400, "Alex" / "MySQL Student" wrote:
>
>>>> What we need are rules that combine a lot of simple rules into concepts
>>>> and then combine those rules into rules that score - and score big. As
>>>> an example, [...]
>>>
>>> Yes, SA definitely needs that and sorely lacks this ultimate feature!
>>
>> Can I respectfully add [...]
>
> Whoa, dude! You just left the heavy sarcasm in, and snipped everything
> from the quote that clarifies this statement and identifies it as
> sarcasm.

I suspect that Alex was responding to Marc rather than to you, and he 
agrees with your sarcasm... The "add" suggests this.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   When I say "I don't want the government to do X", do not
   automatically assume that means I don't want X to happen.
-----------------------------------------------------------------------
  8 days since a sunspot last seen - EPA blames CO2 emissions

Re: Fwd: SA needs a new paradigm for rule structure

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Fri, 2009-10-09 at 15:31 -0400, MySQL Student wrote:
> I sent this message more than an hour ago, and it looks like it's yet
> to hit the list. Resending.

Indeed -- there was an issue with athene for a short period rejecting
valid mailing list posts. Got my previous reply to this thread around
that very time back as well.


> > Whoa, dude! You just left the heavy sarcasm in, and snipped everything
> > from the quote that clarifies this statement and identifies it as
> > sarcasm.
> 
> Yes, I'm really sorry about that. I didn't think that it would not be
> interpreted as sarcasm with the way I quoted it, but looking at it
> now, I see that it might.

Not that big a deal, actually. :)  I just had to clarify, to minimize the
odds of lurkers and archive readers interpreting it out of context and
drawing the wrong conclusions. Hence my pointer to the full post.




Fwd: SA needs a new paradigm for rule structure

Posted by MySQL Student <my...@gmail.com>.
Hi,

I sent this message more than an hour ago, and it looks like it's yet
to hit the list. Resending.

Thanks,
Alex

---------- Forwarded message ----------
From: MySQL Student <my...@gmail.com>
Date: Fri, Oct 9, 2009 at 2:34 PM
Subject: Re: SA needs a new paradigm for rule structure
To: SA Mailing list <us...@spamassassin.apache.org>


Hi,

>> > > What we need are rules that combine a lot of simple rules into concepts
>> > > and then combine those rules into rules that score - and score big. As
>> > > an example, [...]
>> >
>> > Yes, SA definitely needs that and sorely lacks this ultimate feature!
>>
>> Can I respectfully add [...]
>
> Whoa, dude! You just left the heavy sarcasm in, and snipped everything
> from the quote that clarifies this statement and identifies it as
> sarcasm.

Yes, I'm really sorry about that. I didn't think that it would not be
interpreted as sarcasm with the way I quoted it, but looking at it
now, I see that it might.

Best,
Alex

Re: SA needs a new paradigm for rule structure

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Fri, 2009-10-09 at 13:28 -0400, "Alex" / "MySQL Student" wrote:

> > > What we need are rules that combine a lot of simple rules into concepts
> > > and then combine those rules into rules that score - and score big. As
> > > an example, [...]
> >
> > Yes, SA definitely needs that and sorely lacks this ultimate feature!
> 
> Can I respectfully add [...]

Whoa, dude! You just left the heavy sarcasm in, and snipped everything
from the quote that clarifies this statement and identifies it as
sarcasm.

I did NOT mean, imply or actually claim as you quoted.

Please do not quote out of context, and do not change the meaning of
quoted text like that -- regardless of whether you leave the attribution
line in or not.


The paragraph above, starting with "Yes" is utterly wrong as-is, without
context. Please see my full, original reply.




Re: SA needs a new paradigm for rule structure

Posted by MySQL Student <my...@gmail.com>.
Hi,

>> What we need are rules that combine a lot of simple rules into concepts
>> and then combine those rules into rules that score - and score big. As
>> an example, [...]
>
> Yes, SA definitely needs that and sorely lacks this ultimate feature!

Can I respectfully add to this that John Hardin has already done what
I think you're describing in his lotsa_money and advance_fee rules:

http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/sandbox/jhardin/

Regards,
Alex

Re: SA needs a new paradigm for rule structure

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Fri, 2009-10-09 at 11:40 -0700, Marc Perkel wrote:
> Karsten Bräckelmann wrote: 
> > Maybe you really should read up on some docs, and actually have a look
> > at the stock rules, as well as some third-party rule-sets. Just to see
> > your "innovative" concept already in use. And maybe even understand
> > it...  

> What I'm suggesting is a new structure of rules, done by the
> community, that combines phrases into concepts and then combines
> those concepts into new rules that score. This can be done using the
> existing structure, but it means rewriting a lot of rules and creating
> a new rule infrastructure.

This IS being done in a number of stock and third-party rule-sets. There
still is nothing new here.

The only -- albeit not new either -- thing you actually suggested in the
above paragraph, is to have volunteers. Feel free to.




Re: SA needs a new paradigm for rule structure

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Fri, 2009-10-09 at 08:14 -0700, Marc Perkel wrote:
> I've brought this idea up over the years but I'll try to explain it in a 
> different way. Maybe we can do this with a lot of meta rules.
> 
> What we need are rules that combine a lot of simple rules into concepts 
> and then combine those rules into rules that score - and score big. As 
> an example, [...]

Yes, SA definitely needs that and sorely lacks this ultimate feature!

Marc, I'm kind of speechless. What you just described are non-scoring
sub-rules and meta rules. A concept that DOES EXIST in SA, and IS being
used. A concept you had the list explain to you just a few hours ago,
instead of reading the docs.

Now that you explained how to use what we just taught you -- what is new
about it?


> So - we create a lot of simple rules with no points with key words and 
> phrases and then combine these rules using meta rules to get these 
> concepts. That way we have a meta rule like, "they don't know me" "they 
> are talking about transferring millions" "they want my information" 
> "they are talking about hot money". Then you combine those concepts into 
> rules that can definitively determine it is spam.
                 ^^^^^^^^^^^^
SA is a scoring system, and part of its fundamental philosophy is, that
there is NO single rule that definitely [1] determines spamminess.
Instead, lots of rules contribute parts of the overall result.

Still kind of speechless.

Maybe you really should read up on some docs, and actually have a look
at the stock rules, as well as some third-party rule-sets. Just to see
your "innovative" concept already in use. And maybe even understand
it...


[1] Your very example is likely to FP...



Re: SA needs a new paradigm for rule structure

Posted by RW <rw...@googlemail.com>.
On Mon, 12 Oct 2009 10:49:06 -0700
Ted Mittelstaedt <te...@ipinc.net> wrote:


> I think if you sit down and start trying to define examples
> and run them through large databases of spam and ham you
> will find that it doesn't work the way you think it does.  That
> is what I was talking about when I said that statistical
> mathematics has parts that are non-intuitive.

I think what you are saying is that you tried it and it didn't work for
you. That doesn't mean that it can't be made to work - the basic
principle is sound. 

One way I think it might be done is to tokenize large corpora
of ham and spam (mainly fraud), and look for token combinations that are
very strong spam indicators. For example I suspect the simple
two-token combination of lottery+barrister is a pretty reliable
indicator. 
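
A minimal sketch of that idea, assuming each corpus is simply a list of message strings and that "token" means a lowercase run of letters. The function names and the thresholds are placeholders for illustration; on real corpora the cut-offs would have to come from measurement, and the number of pairs per message grows quadratically with its vocabulary.

```python
from collections import Counter
from itertools import combinations
import re


def token_pairs(text):
    """Return the set of unordered token pairs occurring in one message."""
    tokens = sorted(set(re.findall(r"[a-z]+", text.lower())))
    return set(combinations(tokens, 2))


def strong_pairs(spam, ham, min_spam_hits=2, max_ham_hits=0):
    """Pairs seen in at least min_spam_hits spam messages and in at most
    max_ham_hits ham messages. Both thresholds are arbitrary placeholders."""
    spam_counts = Counter(p for msg in spam for p in token_pairs(msg))
    ham_counts = Counter(p for msg in ham for p in token_pairs(msg))
    return {
        pair
        for pair, n in spam_counts.items()
        if n >= min_spam_hits and ham_counts[pair] <= max_ham_hits
    }
```

With a toy corpus where "lottery" and "barrister" each appear alone in ham but together only in spam, the pair ("barrister", "lottery") is among those returned, even though neither token on its own separates the two sets.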

Meta-rules would be an inefficient way of implementing it though.

> The reason you probably think that "meta" rules work better
> is because you have created meta rules that are in reality,
> a grouping of useless rules with a useful rule.  Thus, giving
> the illusion that "a rule that isn't scoring individually"
> actually is scoring when in a meta rule.

Sometimes meta-rules just make more sense: "paypal" and "yahoo" in the
same From header is worth scoring; "paypal" or "yahoo" alone isn't.
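
The distinction between "either" and "both" can be sketched as follows. This is a plain substring check on a From header value, a simplification for illustration rather than how a SpamAssassin header rule is written; the function name is hypothetical.

```python
def from_header_flags(from_header):
    """Return (either, both) for the substrings 'paypal' and 'yahoo'."""
    value = from_header.lower()
    has_paypal = "paypal" in value
    has_yahoo = "yahoo" in value
    return (has_paypal or has_yahoo, has_paypal and has_yahoo)
```

Only the second element of the result corresponds to the combination described above; the first fires on any ordinary Yahoo sender as well.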

Re: SA needs a new paradigm for rule structure

Posted by Ted Mittelstaedt <te...@ipinc.net>.
RW wrote:
> On Fri, 09 Oct 2009 23:40:01 -0700
> Ted Mittelstaedt <te...@ipinc.net> wrote:
> 
> 
>> I know that it seems like the idea of building up "meta" rules with
>> a lot of small rules will give you a more accurate hit rate, but
>> this is one of those non-intuitive things that can be shown by
>> statistical mathematics: the concept won't work.  Or
>> rather, it won't work any better than the existing paradigm.
> 
> I think you just made that up. It clearly depends on the
> circumstances. 

No, it doesn't.

> If two rules correlate strongly in spam and weakly
> correlate or anti-correlate in ham, there's a case for creating a
> meta rule.

That is true only for a mail message that meets the very specific
criteria that match those rules.  It's going to be overridden by
the law of averages, though.

> In some cases it's possible to create useful meta-rules out
> of rules that aren't worth scoring individually.

I think if you sit down and start trying to define examples
and run them through large databases of spam and ham you
will find that it doesn't work the way you think it does.  That
is what I was talking about when I said that statistical
mathematics has parts that are non-intuitive.

Many people have tried doing exactly this with other kinds of
correlations.  You often see this in sports predictions, for
example - amateurs work out "systems" that look for these kinds of
false causalities and use the results to predict the winner of
the next Super Bowl, for example.  It might even work a few
times - but over the long run they fall down.

Fundamentally, a spam rule is either worth scoring or not.  If
it is completely useless - for example, an anti-spam rule that
assumes that any sender with an e-mail address shorter than 15 
characters is more likely a spammer - then if you analyze the
times the rule triggers with a large volume of both ham and
spam, drawn from a wide disparity of sources, you will find it
triggers equally on both ham and spam.

However if a rule does have some point-value, it's going to
ALWAYS trigger more on one side - either trigger more on the
ham side, or more on the spam side.

If it scores more on the spam side then you calculate the
percentage of scoring and use that to assign a point value.

If it triggers more on the ham side then it's useful because
it can be scored to SUBTRACT from the point score.
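
One simple way to put a number on "triggers more on one side" is sketched below. It assumes you already have hit counts for a rule from a ham corpus and a spam corpus; the function name is hypothetical, and turning the resulting fraction into an actual point value is left open, since the thread does not say how that step is done.

```python
def spam_fraction(spam_hits, spam_total, ham_hits, ham_total):
    """Fraction of a rule's rate-normalised hits that land on spam.

    0.5 means the rule fires equally often on ham and spam; values near
    1.0 suggest a positive score, values near 0.0 a negative one.
    """
    spam_rate = spam_hits / spam_total
    ham_rate = ham_hits / ham_total
    if spam_rate + ham_rate == 0:
        return 0.5  # the rule never fired, so there is no evidence either way
    return spam_rate / (spam_rate + ham_rate)
```

Normalising by corpus size first matters when the ham and spam corpora differ in size; raw hit counts alone would favour whichever corpus is larger.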

The reason you probably think that "meta" rules work better
is because you have created meta rules that are in reality,
a grouping of useless rules with a useful rule.  Thus, giving
the illusion that "a rule that isn't scoring individually"
actually is scoring when in a meta rule.

Most of the focus in SA has been in the search for the "killer
rule" that will ALWAYS score on the spam side and NEVER score
on the ham side - because naturally, people want to believe that
content filtering is black-and-white and that there's somewhere
an elusive magic "thang" that separates the ham from the
spam.

But in reality, what is happening in the spam war is that as
time passes, the more easily recognizable spam is being eliminated
by the "low-hanging fruit" anti-spam rules that are being added -
and the spammers are adapting, by making their spam look less
and less like spam and more and more like ham.

One of these days the spam will be so indistinguishable from the
ham that the differences will only be detectable by computer
in corpora of thousands to tens of thousands of pieces of ham and 
spam.  At that time, SA will hopefully be advanced enough to
keep up - because we will be approaching the complexity of
the rulesets used by the human brain to distinguish between
ham and spam.

Fun stuff!

Ted

Re: SA needs a new paradigm for rule structure

Posted by RW <rw...@googlemail.com>.
On Fri, 09 Oct 2009 23:40:01 -0700
Ted Mittelstaedt <te...@ipinc.net> wrote:


> 
> I know that it seems like the idea of building up "meta" rules with
> a lot of small rules will give you a more accurate hit rate, but
> this is one of those non-intuitive things that can be shown by
> statistical mathematics: the concept won't work.  Or
> rather, it won't work any better than the existing paradigm.

I think you just made that up. It clearly depends on the
circumstances. If two rules correlate strongly in spam and weakly
correlate or anti-correlate in ham, there's a case for creating a
meta rule. In some cases it's possible to create useful meta-rules out
of rules that aren't worth scoring individually.
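
A toy illustration of that case, using synthetic data constructed for the purpose: each message is represented only by the set of sub-rule names that fired on it, and the names A and B are placeholders. Whether real corpora contain rule pairs that behave like this is exactly what is being argued in this thread; the sketch only shows that the situation is arithmetically possible.

```python
def hit_rate(messages, predicate):
    """Share of messages (each a set of fired rule names) satisfying predicate."""
    return sum(1 for hits in messages if predicate(hits)) / len(messages)


# Synthetic data: in spam, A and B always fire together;
# in ham, each fires just as often, but never on the same message.
spam = [{"A", "B"}] * 50 + [set()] * 50
ham = [{"A"}] * 50 + [{"B"}] * 50


def a_alone(hits):
    return "A" in hits


def a_and_b(hits):
    return {"A", "B"} <= hits
```

Here rule A fires on half of the spam and half of the ham, so it is not worth a score by itself, while the conjunction fires on half of the spam and none of the ham.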

Re: SA needs a new paradigm for rule structure

Posted by Mynabbler <my...@live.com>.

Marc Perkel wrote:
> I think you are missing my point. Here's an example.
> 
> Mentions God/Christianity = 0
> Mentions Nigeria = 0
> Mentions Bank = 0
> Mentions Funds = 0
> 
> Mentions all 4 = 100
> 
> This is simplistic but it makes my point.
I think you are missing our point. Your simplistic example translates to:

body   __GOD /\bGod\b/
body   __NIGERIA /\bNigeria\b/
body   __BANK /\bBank\b/
body   __FUNDS /\bFunds\b/
body   __SWIFT /\bSwift response\b/
meta     RAISEFLAG (__GOD + __NIGERIA + __BANK + __FUNDS + __SWIFT >= 4)
describe RAISEFLAG 4 out of 5 bad words found, surely a 419 scam
score    RAISEFLAG 100

__GOD does not score, __NIGERIA neither, etc.; 4 out of 5 does: 100, as per
your request.
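
For readers who want to see the arithmetic of that meta rule outside SpamAssassin, here is a small Python emulation. It reuses the five patterns above unchanged (so matching is case-sensitive, as in the rules as written); the function name and the dictionary are illustrative, and this is not how SpamAssassin evaluates rules internally.

```python
import re

# The five non-scoring sub-rules from the example, same patterns.
SUBRULES = {
    "__GOD": r"\bGod\b",
    "__NIGERIA": r"\bNigeria\b",
    "__BANK": r"\bBank\b",
    "__FUNDS": r"\bFunds\b",
    "__SWIFT": r"\bSwift response\b",
}


def raiseflag_score(body, threshold=4, score=100):
    """Each sub-rule counts 1 when it matches; the meta scores only
    when the sum of matches reaches the threshold."""
    hits = sum(1 for pattern in SUBRULES.values() if re.search(pattern, body))
    return score if hits >= threshold else 0
```

A body matching two sub-rules contributes nothing; one matching four contributes the full 100. Note that a lowercase "bank" does not match, because the patterns carry no case-insensitive flag.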




Re: SA needs a new paradigm for rule structure

Posted by Marc Perkel <ma...@perkel.com>.

Ted Mittelstaedt wrote:
> Marc Perkel wrote:
>> I've brought this idea up over the years but I'll try to explain it 
>> in a different way. Maybe we can do this with a lot of meta rules.
>>
>> What we need are rules that combine a lot of simple rules into 
>> concepts and then combine those rules into rules that score - and 
>> score big. As an example, let's take a standard Nigerian scam email.
>>
>>  From <> reply to:
>>
>> [I don't know you] Dear stranger, I am Mr., Ms., Mrs., my name is
>>
>> [I am connected] I am a soldier in Iraq, I am the daughter of an 
>> African president, I work at a bank in Hong Kong
>>
>> [I have money] I have the sum of 56 million dollars USD
>>
>> [the money is hot] no beneficiaries, sneak it out of the country, 
>> oppressive regime
>>
>> [transfer to your account] splitting the funds, wire to your account
>>
>> [I need your information] name, address, account number
>>
>> [I want you to contact me] by email, phone
>>
>> [keep this a secret] confidential discretion
>>
>> So - we create a lot of simple rules with no points, matching key words 
>> and phrases, and then combine these rules using meta rules to get these 
>> concepts. That way we have meta rules like "they don't know me", 
>> "they are talking about transferring millions", "they want my 
>> information", "they are talking about hot money". Then you combine 
>> those concepts into rules that can definitively determine it is spam.
>>
>> And - I am still looking for someone who might do Bayesian or some 
>> other automatic system that looks for rule combinations and increases 
>> scores based on that.
>>
>
> I know that it seems like the idea of building up "meta" rules with
> a lot of small rules will give you a more accurate hit rate, but
> this is one of those non-intuitive things that can be shown by
> statistical mathematics: the concept won't work.  Or
> rather, it won't work any better than the existing paradigm.
>
> In other words, the current system of assigning little points to
> a lot of little rules will yield the same result for any given
> set of spam messages as organizing all
> these small rules into groups that have bigger point values.
>
> The only thing the organization does is help humans understand
> better what is going on.  This is because how humans think about
> math like statistics is a lot different than how a computer
> works with mathematics like statistics.
>
> Ted
>

I think you are missing my point. Here's an example.

Mentions God/Christianity = 0
Mentions Nigeria = 0
Mentions Bank = 0
Mentions Funds = 0

Mentions all 4 = 100

This is simplistic but it makes my point.


Re: SA needs a new paradigm for rule structure

Posted by Ted Mittelstaedt <te...@ipinc.net>.
Marc Perkel wrote:
> I've brought this idea up over the years but I'll try to explain it in a 
> different way. Maybe we can do this with a lot of meta rules.
> 
> What we need are rules that combine a lot of simple rules into concepts 
> and then combine those rules into rules that score - and score big. As 
> an example, let's take a standard Nigerian scam email.
> 
>  From <> reply to:
> 
> [I don't know you] Dear stranger, I am Mr., Ms., Mrs., my name is
> 
> [I am connected] I am a soldier in Iraq, I am the daughter of an 
> African president, I work at a bank in Hong Kong
> 
> [I have money] I have the sum of 56 million dollars USD
> 
> [the money is hot] no beneficiaries, sneak it out of the country, 
> oppressive regime
> 
> [transfer to your account] splitting the funds, wire to your account
> 
> [I need your information] name, address, account number
> 
> [I want you to contact me] by email, phone
> 
> [keep this a secret] confidential discretion
> 
> So - we create a lot of simple rules with no points, matching key words 
> and phrases, and then combine these rules using meta rules to get these 
> concepts. That way we have meta rules like "they don't know me", "they 
> are talking about transferring millions", "they want my information", 
> "they are talking about hot money". Then you combine those concepts into 
> rules that can definitively determine it is spam.
> 
> And - I am still looking for someone who might do Bayesian or some other 
> automatic system that looks for rule combinations and increases scores 
> based on that.
> 

I know that it seems like the idea of building up "meta" rules with
a lot of small rules will give you a more accurate hit rate, but
this is one of those non-intuitive things that can be shown by
statistical mathematics: the concept won't work.  Or
rather, it won't work any better than the existing paradigm.

In other words, the current system of assigning little points to
a lot of little rules will yield the same result for any given
set of spam messages as organizing all
these small rules into groups that have bigger point values.

The only thing the organization does is help humans understand
better what is going on.  This is because how humans think about
math like statistics is a lot different than how a computer
works with mathematics like statistics.

Ted

Re: SA needs a new paradigm for rule structure

Posted by John Hardin <jh...@impsec.org>.
On Fri, 9 Oct 2009, John Hardin wrote:

> ... it could trivially be done right now based on the existing evolver 
> if you simply fed it _all_ of the existing rules to use as its base, and 
> (for example) kept every evolved rule set whose fitness was > 100000 (or 
> whatever turns up as a good cutoff point). Culling overlap would be an 
> interesting exercise.
>
> It's an interesting idea, but right now I don't quite have the hardware to 
> try doing it.

Heh. Tried this out and pegged my CPU for a long time without getting even 
one meta out. 3400+ alleles is a _lot_.

I _definitely_ don't have the oomph to do this... :)

It's CPU bound; it only requires about 25MB memory per process, and disk 
space is negligible. This might be really amenable to distributed 
processing.
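
To give a feel for why 3400+ candidate rules is a lot, the following counts how many distinct combinations of a given size exist among that many rules. This is only a back-of-the-envelope view of the search space: the evolver's internals are not described in this thread, and an evolutionary search samples the space rather than enumerating it. The function name is illustrative.

```python
import math


def subsets_of_size(n_rules, k):
    """Number of distinct k-rule combinations among n_rules rules."""
    return math.comb(n_rules, k)


# How fast the space grows for small meta sizes.
for k in (2, 3, 4):
    print(k, subsets_of_size(3400, k))
```

Pairs alone number 5,778,300, and triples about 6.5 billion. Because each candidate can be evaluated independently of the others, the work splits naturally across machines, which fits the remark about distributed processing.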

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   We have to realize that people who run the government can and do
   change. Our society and laws must assume that bad people -
   criminals even - will run the government, at least part of the
   time.                                               -- John Gilmore
-----------------------------------------------------------------------
  8 days since a sunspot last seen - EPA blames CO2 emissions

Re: SA needs a new paradigm for rule structure

Posted by John Hardin <jh...@impsec.org>.
On Fri, 9 Oct 2009, Marc Perkel wrote:

> What we need are rules that combine a lot of simple rules into concepts and 
> then combine those rules into rules that score - and score big. As an 
> example, let's take a standard Nigerian scam email.
>
> From <> reply to:
>
> [I don't know you] Dear stranger, I am Mr., Ms., Mrs., my name is
>
> [I am connected] I am a soldier in Iraq, I am the daughter of an African 
> president, I work at a bank in Hong Kong
>
> [I have money] I have the sum of 56 million dollars USD
>
> [the money is hot] no beneficiaries, sneak it out of the country, oppressive 
> regime
>
> [transfer to your account] splitting the funds, wire to your account
>
> [I need your information] name, address, account number
>
> [I want you to contact me] by email, phone
>
> [keep this a secret] confidential discretion
>
> So - we create a lot of simple rules with no points, matching key words and 
> phrases, and then combine these rules using meta rules to get these concepts. 
> That way we have meta rules like "they don't know me", "they are talking 
> about transferring millions", "they want my information", "they are talking 
> about hot money". Then you combine those concepts into rules that can 
> definitively determine it is spam.
>
> And - I am still looking for someone who might do Bayesian or some other 
> automatic system that looks for rule combinations and increases scores based 
> on that.

That's exactly what I'm doing right now with the ADVANCE_FEE rules (which 
I did _not_ originate - I'm only freshening them). The structure for 
automatic meta generation is there.

The effort is in generating the subrules and deciding which ones are 
generally related to each other.

The former can't really be automated, but Justin's giving it a shot with 
SOUGHT. It is a lot of work to get broadly good results even for basic 
rules.

The latter would be a good research project; it could trivially be done 
right now based on the existing evolver if you simply fed it _all_ of the 
existing rules to use as its base, and (for example) kept every evolved 
rule set whose fitness was > 100000 (or whatever turns up as a good cutoff 
point). Culling overlap would be an interesting exercise.

It's an interesting idea, but right now I don't quite have the hardware to 
try doing it. Anybody care to order a refurbished 4-core Phenom off 
TigerDirect for me? :) ( <- not serious )

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Judicial Activism (n): interpreting the Constitution to grant the
   government powers that are popularly felt to be "needed" but that
   are not explicitly provided for therein (common definition);
   interpreting the Constitution as it is written (Brady definition)
-----------------------------------------------------------------------
  8 days since a sunspot last seen - EPA blames CO2 emissions