Posted to users@spamassassin.apache.org by Marc Perkel <ma...@perkel.com> on 2009/12/16 16:03:57 UTC
Whitelists in SA
Res wrote:
>
> no whitelist should ever become default part of SA
>
> the day it is, is the day I look elsewhere.
>
>
Why shouldn't white lists become part of SA? Blacklists are part of SA.
My hostkarma whitelists are one of the things that keeps me in business
because my false positive rates are far far better than SA because of
white listing. There are millions of email servers out there that do
nothing but send good email 100% of the time that are easy to detect
because, unlike spammers, they aren't trying to be evasive. I continue
to be of the opinion that SA needs more white rules to detect HAM and not
just SPAM.
Re: Whitelists in SA
Posted by John Hardin <jh...@impsec.org>.
On Sun, 20 Dec 2009, jdow wrote:
> I'm just a touch naive here; but, it seems to me it should be possible,
> somehow, to build running spamd daemons, one with the regular rules
> and one with the mass check rules.
There's nothing special about "masscheck rules". Masscheck is just running
the current ruleset against hand-classified corpora (ideally _large_
hand-classified corpora) to see what hits.
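As a rough illustration of what "seeing what hits" amounts to, the sketch below computes per-rule hit rates over a hand-classified ham set and spam set. It is not the actual masscheck tooling: the function name, the input shape (one list of rule names per message) and the sample data are assumptions made for the example.

```python
from collections import Counter

def hit_rates(ham_hits, spam_hits):
    """Return {rule: (ham_rate, spam_rate)} from per-message rule-hit lists."""
    # Count each rule at most once per message.
    ham_counts = Counter(rule for hits in ham_hits for rule in set(hits))
    spam_counts = Counter(rule for hits in spam_hits for rule in set(hits))
    rates = {}
    for rule in sorted(set(ham_counts) | set(spam_counts)):
        ham_rate = ham_counts[rule] / len(ham_hits) if ham_hits else 0.0
        spam_rate = spam_counts[rule] / len(spam_hits) if spam_hits else 0.0
        rates[rule] = (ham_rate, spam_rate)
    return rates

# Hypothetical hand-classified results: one list of rule names per message.
ham = [["HABEAS_CHECKED"], [], ["HTML_MESSAGE"], ["HTML_MESSAGE"]]
spam = [["HTML_MESSAGE", "URIBL_BLACK"], ["URIBL_BLACK"]]

for rule, (h, s) in hit_rates(ham, spam).items():
    print(f"{rule:20s} ham={h:.2f} spam={s:.2f}")
```

A rule that hits often in spam and rarely in ham is a candidate for a positive score; the numbers only mean something if the classification of the corpora is trustworthy.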
> The second one is fed the email in parallel with the first but deletes
> the mail once the scores are logged.
This can easily be done by analysis of spamd logs. It logs all the rules
hit on every message scanned.
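A minimal sketch of that kind of log analysis follows. It assumes spamd result lines of the form "spamd: result: Y 15 - RULE_A,RULE_B scantime=..." ("Y" for spam, "." for ham); check the pattern against your own logs, since the exact layout may differ between versions and syslog setups. The sample lines and the function name are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout of a spamd result line; adjust to match your own logs.
RESULT_RE = re.compile(
    r"spamd: result: (?P<flag>[Y.]) (?P<score>-?\d+(?:\.\d+)?) - "
    r"(?P<rules>\S*)\s+scantime="
)

def count_rule_hits(lines):
    """Tally rule hits separately for messages SA flagged as spam and as ham."""
    spam, ham = Counter(), Counter()
    for line in lines:
        m = RESULT_RE.search(line)
        if not m:
            continue  # not a result line
        rules = [r for r in m.group("rules").split(",") if r]
        (spam if m.group("flag") == "Y" else ham).update(rules)
    return spam, ham

# Hypothetical log excerpt.
sample_log = [
    "Dec 20 10:00:01 mx spamd[123]: spamd: result: Y 15 - BAYES_99,URIBL_BLACK scantime=1.2,size=2048,user=a",
    "Dec 20 10:00:05 mx spamd[123]: spamd: result: . -4 - HABEAS_ACCREDITED_SOI,HTML_MESSAGE scantime=0.8,size=4096,user=b",
    "Dec 20 10:00:09 mx spamd[123]: spamd: processing message <x@y> for b:1000",
]

spam_hits, ham_hits = count_rule_hits(sample_log)
print("flagged spam:", dict(spam_hits))
print("flagged ham: ", dict(ham_hits))
```

Note that the split here is by SA's own verdict, not by a confirmed classification.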
> The downside is that this is not "confirmed ham" and "confirmed spam".
That unfortunately is the critical part. You can easily glean whether or
not SA thinks a message is "spammy" and what rules led to that
classification; the tough part is confirming whether or not it's _right_.
> I wonder how much companies would pay for a part time SpamAssassin
> honcho who can be trusted (bonded?) and can write SARE-ish rules
> tailored to the company's email. Is there a job opportunity for
> somebody here?
I'd do that.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
"Bother," said Pooh as he struggled with /etc/sendmail.cf, "it never
does quite what I want. I wish Christopher Robin was here."
-- Peter da Silva in a.s.r
-----------------------------------------------------------------------
5 days until Christmas
Re: [sa] Re: Whitelists in SA
Posted by Charles Gregory <cg...@hwcn.org>.
On Sun, 20 Dec 2009, jdow wrote:
> The downside is that this is not "confirmed ham" and "confirmed spam".
(nod) Exactly. And that is what is needed to do a masscheck...
> I wonder how much companies would pay for a part time SpamAssassin
> honcho who can be trusted (bonded?) and can write SARE-ish rules
> tailored to the company's email. Is there a job opportunity for somebody
> here? (And, yes, I do suspect the burnout time would be rather short.)
(smile) I've got my own custom rule file format (plus a script to convert
to standard SA rules format). This reduces the effort to add a new rule
pretty much down to a cut-n-paste operation. Must admit there are some
days when I do feel a bit burned out, but generally I am gratified to see
my new rules trigger on the remainder of a spam flood.... :)
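The custom format itself is not described in this thread, so the following is only a guess at what such a converter could look like. It assumes a hypothetical one-line-per-rule format, "NAME | score | description | pattern", and emits ordinary SA body/describe/score lines. The rule name and pattern are invented.

```python
def to_sa_rules(text):
    """Convert 'NAME | score | description | pattern' lines to SA rule lines."""
    out = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        # The pattern comes last so it may itself contain '|'.
        name, score, desc, pattern = [f.strip() for f in line.split("|", 3)]
        float(score)  # raises ValueError if the score is not numeric
        pattern = pattern.replace("/", r"\/")  # keep the /.../ delimiters intact
        out.append(f"body     {name} /{pattern}/i")
        out.append(f"describe {name} {desc}")
        out.append(f"score    {name} {score}")
    return "\n".join(out)

# Hypothetical one-line rule pasted in during a spam flood.
print(to_sa_rules("LOCAL_PILLS | 2.5 | Pill spam flood | cheap (pills|meds)"))
```

The generated lines would still need a lint pass (for example "spamassassin --lint") before being put into service.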
As for trust, I never need to see the ham, just the spam, which has no
privacy issues (smile).
- C
Re: Whitelists in SA
Posted by Warren Togami <wt...@redhat.com>.
On 12/20/2009 09:20 AM, Charles Gregory wrote:
> On Sat, 19 Dec 2009, Daryl C. W. O'Shea wrote:
>>> More unfortunately, privacy concerns prevent me from building a useful
>>> corpus of ham. Sigh....
>>> But otherwise such a good idea....
>> Can you not trust yourself to use your own ham? You don't need to
>> provide us with your mail. You can scan your own mail locally on your
>> own machine(s).
>
> I run an ISP. The corpus I would so love to build is the hundreds of
> messages per day that all our clients receive. It's *their* privacy that
> is the concern.
Right, they would need to opt-in and the manual sorting requirements are
a bit too difficult and time consuming for all but the most dedicated to
this cause.
>
> Do you think that my own private collection of saved mail (perhaps 1100
> ham) would really be of benefit? I'd have to start saving my spam as
> well....
A Ham-only corpus is still useful, as long as it contains mail from a
variety of sources. (Mailing lists are not very useful.)
>
> And it would always be skewed by the fact that I SMTP reject anything
> caught by Zen.
>
Not a problem.
Warren
Re: Whitelists in SA
Posted by John Hardin <jh...@impsec.org>.
On Fri, 18 Dec 2009, Charles Gregory wrote:
> On Thu, 17 Dec 2009, jdow wrote:
>> It is a good thing this issue was raised. It led to appropriate mass
>> check runs. I expect that will lead to saner scoring within the SA
>> framework. If not and it bites me, THEN I'll raise the issue again.
>> Does that seem fair?
>
> 50_scores.cf:score HABEAS_ACCREDITED_COI 0 -8.0 0 -8.0
> 50_scores.cf:score HABEAS_ACCREDITED_SOI 0 -4.3 0 -4.3
> 50_scores.cf:score HABEAS_CHECKED 0 -0.2 0 -0.2
>
> Still no changes through the sa-update channel.
There won't be until after 3.3.0 ships. Then changes to 3.2.x (including a
possible 3.2.6 release) will be considered.
As far as I know rule promotion and rescoring are not automatic for 3.2.x,
it's still a manual process. All of the focus right now is on getting
3.3.0 out.
Re: [sa] Re: Whitelists in SA
Posted by Charles Gregory <cg...@hwcn.org>.
On Fri, 18 Dec 2009, jdow wrote:
>> Perhaps you meant CHAIR and keyboard? ;)
> I should have guessed you've managed to short circuit the path
> through your brain.
> {O,o} <-- Grinning, ducking, and running REAL fast that way>>>>>>>>>
> (Thanks for the straight line. {^_-})
(Thinks twice about it)
Ouch. Subtle. I like it. :)
- Charles
Re: [sa] Re: Whitelists in SA
Posted by jdow <jd...@earthlink.net>.
From: "Charles Gregory" <cg...@hwcn.org>
Sent: Friday, 2009/December/18 13:49
> On Fri, 18 Dec 2009, jdow wrote:
>>> On Thu, 17 Dec 2009, jdow wrote:
>>> Still no changes through the sa-update channel.
>>> Is there a time delay in the masscheck results being applied?
>> Yes, there is, Mr. Gregory. It exists between your monitor and your
>> keyboard.
>
> There is a one inch gap between those two.
>
> Perhaps you meant CHAIR and keyboard? ;)
I should have guessed you've managed to short circuit the path
through your brain.
{O,o} <-- Grinning, ducking, and running REAL fast that way>>>>>>>>>
(Thanks for the straight line. {^_-})
Re: [sa] Re: Whitelists in SA
Posted by Charles Gregory <cg...@hwcn.org>.
On Fri, 18 Dec 2009, jdow wrote:
>> On Thu, 17 Dec 2009, jdow wrote:
>> Still no changes through the sa-update channel.
>> Is there a time delay in the masscheck results being applied?
> Yes, there is, Mr. Gregory. It exists between your monitor and your
> keyboard.
There is a one inch gap between those two.
Perhaps you meant CHAIR and keyboard? ;)
- C
Re: Whitelists in SA
Posted by jdow <jd...@earthlink.net>.
From: "Charles Gregory" <cg...@hwcn.org>
Sent: Sunday, 2009/December/20 06:20
> On Sat, 19 Dec 2009, Daryl C. W. O'Shea wrote:
>>> More unfortunately, privacy concerns prevent me from building a useful
>>> corpus of ham. Sigh....
>>> But otherwise such a good idea....
>> Can you not trust yourself to use your own ham? You don't need to
>> provide us with your mail. You can scan your own mail locally on your
>> own machine(s).
>
> I run an ISP. The corpus I would so love to build is the hundreds of
> messages per day that all our clients receive. It's *their* privacy that
> is the concern.
>
> Do you think that my own private collection of saved mail (perhaps 1100
> ham) would really be of benefit? I'd have to start saving my spam as
> well....
>
> And it would always be skewed by the fact that I SMTP reject anything
> caught by Zen.
I'm just a touch naive here; but, it seems to me it should be possible,
somehow, to build running spamd daemons, one with the regular rules
and one with the mass check rules. The second one is fed the email in
parallel with the first but deletes the mail once the scores are logged.
The downside is that this is not "confirmed ham" and "confirmed spam".
It is a way to safely test new rule sets, though.
I must admit that the vast majority of email I receive is not hand
checked for ham/spam. I simply read headers on several lists to see
what the current buzz is. I read threads that look interesting and
toss the rest. So it'd be hard to mass check validly with that as a
corpus. (Besides, I suspect animal husbandry companies would hardly
be interested in passing things that look like typical LKML mailings,
would they?)
I wonder how much companies would pay for a part time SpamAssassin
honcho who can be trusted (bonded?) and can write SARE-ish rules
tailored to the company's email. Is there a job opportunity for
somebody here? (And, yes, I do suspect the burnout time would be
rather short.)
{^_^}
Re: Whitelists in SA
Posted by Charles Gregory <cg...@hwcn.org>.
On Sat, 19 Dec 2009, Daryl C. W. O'Shea wrote:
>> More unfortunately, privacy concerns prevent me from building a useful
>> corpus of ham. Sigh....
>> But otherwise such a good idea....
> Can you not trust yourself to use your own ham? You don't need to
> provide us with your mail. You can scan your own mail locally on your
> own machine(s).
I run an ISP. The corpus I would so love to build is the hundreds of
messages per day that all our clients receive. It's *their* privacy that
is the concern.
Do you think that my own private collection of saved mail (perhaps 1100
ham) would really be of benefit? I'd have to start saving my spam as
well....
And it would always be skewed by the fact that I SMTP reject anything
caught by Zen.
- C
Re: Whitelists in SA
Posted by jdow <jd...@earthlink.net>.
From: "Charles Gregory" <cg...@hwcn.org>
Sent: Friday, 2009/December/18 06:56
> On Thu, 17 Dec 2009, jdow wrote:
>> It is a good thing this issue was raised. It led to appropriate mass
>> check runs. I expect that will lead to saner scoring within the SA
>> framework. If not and it bites me, THEN I'll raise the issue again.
>> Does that seem fair?
>
> 50_scores.cf:score HABEAS_ACCREDITED_COI 0 -8.0 0 -8.0
> 50_scores.cf:score HABEAS_ACCREDITED_SOI 0 -4.3 0 -4.3
> 50_scores.cf:score HABEAS_CHECKED 0 -0.2 0 -0.2
>
> Still no changes through the sa-update channel.
> Is there a time delay in the masscheck results being applied?
>
> - Charles
Yes, there is, Mr. Gregory. It exists between your monitor and your
keyboard.
{^_^}
Re: [sa] Re: Whitelists in SA
Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
On 19/12/2009 5:51 PM, Charles Gregory wrote:
> On Fri, 18 Dec 2009, Warren Togami wrote:
>> Why wait, when you can do relatively simple things to help make it happen?
>> http://wiki.apache.org/spamassassin/NightlyMassCheck
>> We can more frequently update rules if more people participate in the
>> nightly masschecks. The current documentation is a bit of a confusing
>> mess unfortunately.
>
> More unfortunately, privacy concerns prevent me from building a useful
> corpus of ham. Sigh....
>
> But otherwise such a good idea....
Can you not trust yourself to use your own ham? You don't need to
provide us with your mail. You can scan your own mail locally on your
own machine(s).
Daryl
Re: [sa] Re: Whitelists in SA
Posted by Charles Gregory <cg...@hwcn.org>.
On Fri, 18 Dec 2009, Warren Togami wrote:
> Why wait, when you can do relatively simple things to help make it happen?
> http://wiki.apache.org/spamassassin/NightlyMassCheck
> We can more frequently update rules if more people participate in the
> nightly masschecks. The current documentation is a bit of a confusing mess
> unfortunately.
More unfortunately, privacy concerns prevent me from building a useful
corpus of ham. Sigh....
But otherwise such a good idea....
- C
Re: [sa] Re: Whitelists in SA
Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
On 18/12/2009 5:13 PM, Warren Togami wrote:
> On 12/18/2009 04:56 PM, Charles Gregory wrote:
>> On Fri, 18 Dec 2009, John Hardin wrote:
>>> We hope to get rule scoring and publication much more automated -
>>> i.e., if a rule in the sandbox works well based on the automated
>>> masschecks, it would be automatically scored and published via
>>> sa-update.
>>
>> Music to my ears. I will wait (semi-)patiently. Thanks.
>>
>> - C
>
> Why wait, when you can do relatively simple things to help make it happen?
>
> http://wiki.apache.org/spamassassin/NightlyMassCheck
> We can more frequently update rules if more people participate in the
> nightly masschecks. The current documentation is a bit of a confusing
> mess unfortunately.
Exactly! We have code to do this now. But I'm positive that we don't
have a large and diverse enough ham corpus (on a daily basis, not the
big turn out for the "legacy" re-score mass-checks) to trust it.
Contributors are always welcome!
Daryl
Re: [sa] Re: Whitelists in SA
Posted by Warren Togami <wt...@redhat.com>.
On 12/18/2009 04:56 PM, Charles Gregory wrote:
> On Fri, 18 Dec 2009, John Hardin wrote:
>> We hope to get rule scoring and publication much more automated -
>> i.e., if a rule in the sandbox works well based on the automated
>> masschecks, it would be automatically scored and published via sa-update.
>
> Music to my ears. I will wait (semi-)patiently. Thanks.
>
> - C
Why wait, when you can do relatively simple things to help make it happen?
http://wiki.apache.org/spamassassin/NightlyMassCheck
We can more frequently update rules if more people participate in the
nightly masschecks. The current documentation is a bit of a confusing
mess unfortunately.
Warren
Re: [sa] Re: Whitelists in SA
Posted by Charles Gregory <cg...@hwcn.org>.
On Fri, 18 Dec 2009, John Hardin wrote:
> We hope to get rule scoring and publication much more automated - i.e.,
> if a rule in the sandbox works well based on the automated masschecks,
> it would be automatically scored and published via sa-update.
Music to my ears. I will wait (semi-)patiently. Thanks.
- C
Re: [sa] Re: Whitelists in SA
Posted by John Hardin <jh...@impsec.org>.
On Fri, 18 Dec 2009, Charles Gregory wrote:
> I recognize, from the existence of such sites as 'rules du jour' that it
> has long been a practice for SA to release 'core' rule updates very
> infrequently. But with respect, I question whether that is still a good
> practice, particularly when an 'issue' raises concern over a particular
> set of scores, and it would *appear* that these updates require
> relatively little effort.
We hope to get rule scoring and publication much more automated - i.e., if
a rule in the sandbox works well based on the automated masschecks, it
would be automatically scored and published via sa-update.
Re: [sa] Re: Whitelists in SA
Posted by Charles Gregory <cg...@hwcn.org>.
On Fri, 18 Dec 2009, LuKreme wrote:
> It's already been stated no changes to 3.2.5 will be made until 3.3 is
> done, hasn't it?
Well, at this point, I respectfully bow, and take a step back, so as not
to sound too demanding of our great volunteers (smile), but I believe
in another of my posts I put forward the idea that design, testing and
implementation of rules should be a bit more 'frequent', drawing upon
the model of ClamAV, with signatures being frequently released, even
while the next major 'engine' update is in the works.
I recognize, from the existence of such sites as 'rules du jour' that it
has long been a practice for SA to release 'core' rule updates very
infrequently. But with respect, I question whether that is still a good
practice, particularly when an 'issue' raises concern over a particular
set of scores, and it would *appear* that these updates require relatively
little effort.
So, to put it bluntly, I don't see how a couple of rules changes are
worthy of being 'held back' by the entire push to SA 3.3..... I would
think that a few quick adjustments, and presumably a 'masscheck' would
suffice, and new/revised rules could be released at least on a monthly
basis without any serious concern for compromising the overall score
balance that is the critical goal of SA updates?
Or am I grossly mis-estimating the work-load? :)
- C
Re: Whitelists in SA
Posted by LuKreme <kr...@kreme.com>.
On Dec 18, 2009, at 7:56, Charles Gregory <cg...@hwcn.org> wrote:
> Still no changes through the sa-update channel.
> Is there a time delay in the masscheck results being applied?
It's already been stated no changes to 3.2.5 will be made until 3.3 is
done, hasn't it?
Re: Whitelists in SA
Posted by Charles Gregory <cg...@hwcn.org>.
On Thu, 17 Dec 2009, jdow wrote:
> It is a good thing this issue was raised. It led to appropriate mass
> check runs. I expect that will lead to saner scoring within the SA
> framework. If not and it bites me, THEN I'll raise the issue again.
> Does that seem fair?
50_scores.cf:score HABEAS_ACCREDITED_COI 0 -8.0 0 -8.0
50_scores.cf:score HABEAS_ACCREDITED_SOI 0 -4.3 0 -4.3
50_scores.cf:score HABEAS_CHECKED 0 -0.2 0 -0.2
Still no changes through the sa-update channel.
Is there a time delay in the masscheck results being applied?
- Charles
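For readers unfamiliar with the four-number form of a score line: per the SpamAssassin configuration documentation, the values apply, in order, to (1) Bayes off and network tests off, (2) Bayes off and network on, (3) Bayes on and network off, (4) both on. The zeros in the lines above are consistent with these being network lookups that cannot fire when network tests are disabled. The sketch below parses such grep output and picks the active value; the function names are made up for the example.

```python
def parse_score_line(line):
    """Parse 'score NAME v0 [v1 v2 v3]', with an optional grep 'file:' prefix."""
    first = line.split()[0]
    if ":" in first:
        line = line.split(":", 1)[1]  # drop the "50_scores.cf:" prefix
    fields = line.split()
    if len(fields) < 3 or fields[0] != "score":
        raise ValueError(f"not a score line: {line!r}")
    name = fields[1]
    values = [float(v) for v in fields[2:]]
    if len(values) == 1:
        values = values * 4  # a single score applies to all four sets
    if len(values) != 4:
        raise ValueError(f"expected 1 or 4 scores, got {len(values)}")
    return name, values

def active_score(values, bayes, network):
    """Pick the value for the score set in use (index = 2*bayes + network)."""
    return values[(2 if bayes else 0) + (1 if network else 0)]

name, values = parse_score_line(
    "50_scores.cf:score HABEAS_ACCREDITED_COI 0 -8.0 0 -8.0"
)
print(name, "with Bayes and network tests:", active_score(values, True, True))
print(name, "with Bayes only:", active_score(values, True, False))
```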
Re: Whitelists in SA
Posted by jdow <jd...@earthlink.net>.
From: "J.D. Falk" <jd...@cybernothing.org>
Sent: Thursday, 2009/December/17 11:21
On Dec 16, 2009, at 8:35 AM, LuKreme wrote:
> The fact is I *AM* their customer. The people writing them checks are not,
> they're just their funders. Whitelist companies have to convince admins to
> use their list. The only way to do that is to have really really really
> high quality lists that really do prevent spam delivery. If I don't use
> their whitelist, and others don't use their whitelist, then their model
> falls apart and they don't make money
Exactly what Return Path has been saying (and acting upon) for years.
(We could debate whether Habeas followed that rule before we bought the
company, but it's impolite to speak ill of the dead.)
> but no company is enlightened enough to realise this.
Heh.
<<jdow Lukreme seems to not have much of an engineering education
and zero experience with statistics. It is statistically impossible
to remove all spam perfectly and let all ham through perfectly. Perfect
is a goal you can never reach. If you obsess about it, you will find
yourself "round the bend" before long. All you can do is adjust the
ratio of missed ham to missed spam one way or the other. Where you
"slice" is pretty much up to you. What is the cost, the real cost in
lost customers or dollars spent, for a missed ham and for a missed
spam? If you can hit that balance point for minimum overall cost you've
done your job. If you sit and bitch about something not being perfect,
then you're not doing your job.
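The "where you slice" argument can be made concrete with a small sketch: given scores for known ham and known spam, try a range of thresholds and keep the one with the lowest total cost. The scores, the costs and the function name below are invented for illustration; real costs have to come from your own site.

```python
def best_threshold(ham_scores, spam_scores, cost_fp, cost_fn, thresholds):
    """Return (threshold, cost, false_positives, false_negatives) at minimum cost."""
    best = None
    for t in thresholds:
        # A message is treated as spam when its score reaches the threshold.
        fp = sum(1 for s in ham_scores if s >= t)   # ham tagged as spam
        fn = sum(1 for s in spam_scores if s < t)   # spam let through
        cost = cost_fp * fp + cost_fn * fn
        if best is None or cost < best[1]:
            best = (t, cost, fp, fn)
    return best

# Hypothetical scores for hand-confirmed ham and spam.
ham = [-2.0, 0.5, 1.2, 3.9, 5.4]
spam = [4.1, 6.0, 7.5, 12.0, 3.0]

# Assumption: losing one ham costs ten times as much as one missed spam.
print(best_threshold(ham, spam, cost_fp=10, cost_fn=1, thresholds=[3, 4, 5, 6, 7, 8]))
```

Changing the cost ratio moves the chosen threshold; no choice brings both error counts to zero on overlapping score distributions.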
It is a good thing this issue was raised. It led to appropriate mass
check runs. I expect that will lead to saner scoring within the SA
framework. If not and it bites me, THEN I'll raise the issue again.
Does that seem fair?
{^_^}
Re: Whitelists in SA
Posted by "J.D. Falk" <jd...@cybernothing.org>.
On Dec 16, 2009, at 8:35 AM, LuKreme wrote:
> The fact is I *AM* their customer. The people writing them checks are not, they're just their funders. Whitelist companies have to convince admins to use their list. The only way to do that is to have really really really high quality lists that really do prevent spam delivery. If I don't use their whitelist, and others don't use their whitelist, then their model falls apart and they don't make money
Exactly what Return Path has been saying (and acting upon) for years.
(We could debate whether Habeas followed that rule before we bought the company, but it's impolite to speak ill of the dead.)
> but no company is enlightened enough to realise this.
Heh.
--
J.D. Falk <jd...@returnpath.net>
Return Path Inc
Re: Whitelists in SA
Posted by LuKreme <kr...@kreme.com>.
On 16-Dec-2009, at 08:03, Marc Perkel wrote:
> Res wrote:
>>
>> no whitelist should ever become default part of SA
>>
>> the day it is, is the day I look elsewhere.
>
> Why shouldn't white lists become part of SA? Blacklists are part of SA. My hostkarma whitelists are one of the things that keeps me in business because my false positive rates are far far better than SA because of white listing. There are millions of email servers out there that do nothing but send good email 100% of the time that are easy to detect because, unlike spammers, they aren't trying to be evasive. I continue to be of the opinion that SA needs more white rules to detect HAM and not just SPAM.
I would say that no COMMERCIAL whitelist should be part of SA. I use whitelisting myself, but I'm not going to trust someone who has a financial interest in getting mail delivered to me to be diligent in their whitelisting. After all, their bean-counters don't see me as the customer because I'm not writing them checks.
The fact is I *AM* their customer. The people writing them checks are not, they're just their funders. Whitelist companies have to convince admins to use their list. The only way to do that is to have really really really high quality lists that really do prevent spam delivery. If I don't use their whitelist, and others don't use their whitelist, then their model falls apart and they don't make money, but no company is enlightened enough to realise this.
--
They say only the good die young. If it works the other way too
I'm immortal