Posted to users@spamassassin.apache.org by Evan Platt <ev...@espphotography.com> on 2006/11/06 22:44:31 UTC

Do something useful with bad addresses?

Running Postfix with SpamAssassin called via procmail.

Can I do something useful with bad address spam, rather than rejecting it?

I don't know if this is a question for a postfix list or a 
spamassassin list, but perhaps automatically report it, and 
automatically learn as spam?

Off the bat, I wouldn't want to do it for every bad e-mail address, 
just a bunch of known bad ones, addresses that somehow got added to 
mailing lists, or the such.

Any suggestions?

Thanks. :)

Evan


Re: Do something useful with bad addresses?

Posted by "John D. Hardin" <jh...@impsec.org>.
On Mon, 6 Nov 2006, Evan Platt wrote:

> Ok, so just to get a second (and third, and so on) opinion, I created 
> a user on my system, named it spambucket. Aliased the garbage 
> addresses to spambucket in /etc/postfix/aliases, then at midnight run
> 
> sa-learn --spam -u primaryuser --mbox /private/var/mail/spambucket
> 
> (primaryuser is my mail account).
> 
> For postfix is that correct?

No idea, I herd Sendmail cats.

> And then (yes, I know this is more a 
> postfix question), can I simply delete - rm /private/var/mail/spambucket ?

You can, but I like the idea of keeping the corpora around for review
and retraining should it become necessary.

> I think I can string this on one line, ie
> 
> sa-learn --spam -u primaryuser --mbox /private/var/mail/spambucket & 
> rm /private/var/mail/spambucket

I think you want to use && instead of & for that. "&&" is "proceed
only if the preceding step succeeded", where "&" would background the
sa-learn and then immediately get rid of the file you told it to
process.

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  The difference between ignorance and stupidity is that the stupid
  desire to remain ignorant.                             -- Jim Bacon
-----------------------------------------------------------------------
 Tomorrow: the campaign ads stop
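
A minimal sketch of the "&&" behaviour described above. The two
functions are hypothetical stand-ins for sa-learn (one that succeeds, one
that fails) so the example can run without touching a real mailbox or
Bayes database.

```shell
#!/bin/sh
# Hypothetical stand-ins for sa-learn: one succeeds, one fails.
learn_ok()   { return 0; }
learn_fail() { return 1; }

# Scratch directory with two fake mailboxes.
workdir=$(mktemp -d)
touch "$workdir/kept" "$workdir/removed"

# Learning failed: "&&" skips the rm, so the mailbox survives.
learn_fail && rm "$workdir/kept"

# Learning succeeded: "&&" lets the rm run.
learn_ok && rm "$workdir/removed"

# Only the mailbox whose learning step failed is still listed.
ls "$workdir"
```

With a single "&" the left-hand command would instead be started in the
background and the rm would run straight away, regardless of whether
learning had finished or succeeded.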


Re: Do something useful with bad addresses?

Posted by Evan Platt <ev...@espphotography.com>.
At 01:55 PM 11/6/2006, you wrote:

>Create the user accounts (which is the simplest way to get it to start
>accepting messages) and run sa-learn --spam against their mailboxes
>from cron.daily.
>
>Alternatively, create a single sa-honeypot user and redirect all your
>targeted bad addresses to that account, and then run sa-learn --spam
>against that one mailbox from cron.daily.


I hate when I miss the simple solutions..

Ok, so just to get a second (and third, and so on) opinion, I created 
a user on my system, named it spambucket. Aliased the garbage 
addresses to spambucket in /etc/postfix/aliases, then at midnight run

sa-learn --spam -u primaryuser --mbox /private/var/mail/spambucket

(primaryuser is my mail account).

For postfix is that correct? And then (yes, I know this is more a 
postfix question), can I simply delete - rm /private/var/mail/spambucket ?

I think I can string this on one line, ie

sa-learn --spam -u primaryuser --mbox /private/var/mail/spambucket & 
rm /private/var/mail/spambucket

Evan 
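
A rough sketch of the aliasing step described above. The helper below only
prints alias lines in the "name: target" form used by the aliases file; the
address names are made up for illustration, and "spambucket" is the account
name from the post. Which aliases file Postfix actually reads depends on
the alias_maps setting, so treat the path in the comments as an assumption.

```shell
#!/bin/sh
# Print one alias line per local part, all pointing at the spambucket user.
# make_aliases is a hypothetical helper, not part of Postfix.
make_aliases() {
    for name in "$@"; do
        printf '%s: spambucket\n' "$name"
    done
}

# Example local parts (invented for this sketch).
make_aliases freeipod-signup old-list-addr

# After appending the printed lines to the aliases file
# (e.g. /etc/postfix/aliases, as in the post), rebuild the
# alias database with:
#   newaliases
```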


Re: Do something useful with bad addresses?

Posted by John Andersen <js...@pen.homeip.net>.
On Monday 06 November 2006 21:50, John Rudd wrote:
> And, I have in fact seen misses that had VERY low bayes scores (BAYES_00).

With no more info about the content of said misses it would be hard to say
your bayes was poisoned. 

It would be even harder to see how spam would poison bayes to MISS things.
Historically the idea of poison was to make bayes useless and untrustworthy 
by causing it to generate too many false positives.  That essentially hasn't
worked out too well.

It's not too hard to imagine that spam sent to Linux users, pretending
to deal with issues pertaining to Linux but slipping in a couple of lines
about <insert spam topic>, might sneak by.  But this hardly fits my
definition of bayes poison.



-- 
_____________________________________
John Andersen

Re: Do something useful with bad addresses?

Posted by John Rudd <jr...@ucsc.edu>.
John Andersen wrote:
> On Monday 06 November 2006 15:02, Derek Harding wrote:
>>  I suspect that the large amount of image spam that
>> contains just bayes poison poisoned my bayes DB.
> 
> Oh, I don't know...
> I think the concept of bayes poisoning has been pretty well
> debunked.  Judging by the absence of false positives with high
> bayes scores, it seems like a theory looking for a proof.
> 

That's not the only consequence of a theoretical bayes poisoning.  They 
might not be at all trying to create false positives.  What they want is 
to create misses (false negatives).

And, I have in fact seen misses that had VERY low bayes scores (BAYES_00).



Re: Do something useful with bad addresses?

Posted by Evan Platt <ev...@espphotography.com>.
At 09:24 PM 11/6/2006, you wrote:
>Oh, I don't know...
>I think the concept of bayes poisoning has been pretty well
>debunked.  Judging by the absence of false positives with high
>bayes scores, it seems like a theory looking for a proof.

Where's Jamie Hyneman and Adam Savage when you need them? 


Re: Do something useful with bad addresses?

Posted by John Andersen <js...@pen.homeip.net>.
On Monday 06 November 2006 15:02, Derek Harding wrote:
>  I suspect that the large amount of image spam that
> contains just bayes poison poisoned my bayes DB.

Oh, I don't know...
I think the concept of bayes poisoning has been pretty well
debunked.  Judging by the absence of false positives with high
bayes scores, it seems like a theory looking for a proof.

-- 
_____________________________________
John Andersen

Re: Do something useful with bad addresses?

Posted by Kelson <ke...@speed.net>.
John Rudd wrote:
> I had a similar problem.  I don't divert unknown addresses to sa-learn, 
> but if I don't fish a message out of my spam folder within X days, it 
> gets automatically sent to sa-learn and awl.
> 
> Then, last week, I started seeing BAYES_00 on messages that would have 
> otherwise been scored as spam.  I responded by removing the negative 
> values for low bayes probabilities.

Wait, how does training random text as spam indicators result in Bayes 
thinking that text indicates ham?

At worst, I can see it diluting the Bayes scores for strong indicators, 
resulting in more hits close to BAYES_50.  But to trigger BAYES_00 means 
that you have to have trained something similar *as ham*.

I expect this has less to do with automated training (since, by your 
description, we're not talking auto-learn, so it won't learn anything in 
that folder as ham) and more to do with a new type of spam that 
simulates real mail more effectively, or that manages to get 
auto-learned in the initial SA process (if you have auto-learn enabled).

-- 
Kelson Vibber
SpeedGate Communications <www.speed.net>
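
A toy illustration of the dilution argument above. This is NOT how
SpamAssassin computes its Bayes score (the real classifier normalises by
message counts and combines many tokens); it is only a simplified per-token
ratio, spam / (spam + ham), with made-up counts, to show why extra spam
training pulls a token toward the middle rather than toward the ham end.

```shell
#!/bin/sh
# Simplified per-token spam probability: spam / (spam + ham).
# token_prob is a hypothetical helper; arguments are spam count, ham count.
token_prob() {
    awk -v s="$1" -v h="$2" 'BEGIN { printf "%.2f\n", s / (s + h) }'
}

token_prob 0 10    # token only ever seen in ham       -> 0.00
token_prob 10 10   # same token after 10 spam trainings -> 0.50
```

In this simplified model, training more spam can only raise a token's
value; getting it back down toward 0.00 requires more ham sightings.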

Re: Do something useful with bad addresses?

Posted by John Rudd <jr...@ucsc.edu>.
Derek Harding wrote:
> On Mon, 2006-11-06 at 15:49 -0800, Evan Platt wrote:
>> Just to clarify.. Yes, these would definitely be targeted addresses. 
>> I grepped my mail log for common spam I get and I'll add those.
> 
> There is one problem with this approach though. All spam is learned. I
> tried this and I suspect that the large amount of image spam that
> contains just bayes poison poisoned my bayes DB. I'm not certain this is
> the case but of late bayes is working exceedingly poorly for me and it
> used to work exceedingly well.
> 

I had a similar problem.  I don't divert unknown addresses to sa-learn, 
but if I don't fish a message out of my spam folder within X days, it 
gets automatically sent to sa-learn and awl.

Then, last week, I started seeing BAYES_00 on messages that would have 
otherwise been scored as spam.  I responded by removing the negative 
values for low bayes probabilities.

Haven't noticed any ham in my spam folder yet, so I'm ok with it right now.

Re: Do something useful with bad addresses?

Posted by "John D. Hardin" <jh...@impsec.org>.
On Mon, 6 Nov 2006, Derek Harding wrote:

> On Mon, 2006-11-06 at 15:49 -0800, Evan Platt wrote:
> > Just to clarify.. Yes, these would definitely be targeted addresses. 
> > I grepped my mail log for common spam I get and I'll add those.
> 
> There is one problem with this approach though. All spam is learned. I
> tried this and I suspect that the large amount of image spam that
> contains just bayes poison poisoned my bayes DB. I'm not certain this is
> the case but of late bayes is working exceedingly poorly for me and it
> used to work exceedingly well.

+1

I hand-train, and it is not at all surprising for me to see BAYES_00
on image spams (but only on image spams).

I am VERY glad I don't trust autolearning.

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  The difference between ignorance and stupidity is that the stupid
  desire to remain ignorant.                             -- Jim Bacon
-----------------------------------------------------------------------
 Tomorrow: the campaign ads stop


Re: Do something useful with bad addresses?

Posted by Derek Harding <de...@innovyx.com>.
On Mon, 2006-11-06 at 15:49 -0800, Evan Platt wrote:
> Just to clarify.. Yes, these would definitely be targeted addresses. 
> I grepped my mail log for common spam I get and I'll add those.

There is one problem with this approach though. All spam is learned. I
tried this and I suspect that the large amount of image spam that
contains just bayes poison poisoned my bayes DB. I'm not certain this is
the case but of late bayes is working exceedingly poorly for me and it
used to work exceedingly well.

Derek



Re: Do something useful with bad addresses?

Posted by Evan Platt <ev...@espphotography.com>.
At 03:15 PM 11/6/2006, you wrote:

>>Did you miss the word "targeted"? I took the original poster to mean
>>forward all the bad addresses _that you choose_ to sa-honeypot.
>>Thus your scenario is only a problem if you choose to forward an address
>>that's close to an actual address.
>
>Oh, yeah, I did read past "targeted".  Sorry :-}

Just to clarify.. Yes, these would definitely be targeted addresses. 
I grepped my mail log for common spam I get and I'll add those.

A few are addresses I have no idea how they were picked up, and a few 
are obvious from submitted spam sites, i.e. free iPod sites, etc.

Plus, this is my own domain, I'm the only user, so I don't have an IT 
person to blame when things go wrong, but on the flip side, I don't 
have an angry CEO to answer to when things go wrong either. :) 


Re: Do something useful with bad addresses?

Posted by John Rudd <jr...@ucsc.edu>.
Derek Harding wrote:
> On Mon, 2006-11-06 at 14:34 -0800, John Rudd wrote:
>>> Alternatively, create a single sa-honeypot user and redirect all your
>>> targeted bad addresses to that account, and then run sa-learn --spam
>>> against that one mailbox from cron.daily.
>>>
>> And then an important email comes into the organization, with a single 
>> pair of characters transposed, and thus not only disappears, but gets 
>> learned as spam.  Then everyone in IT draws straws to see who gets to 
>> fall on their sword.
> 
> Did you miss the word "targeted"? I took the original poster to mean
> forward all the bad addresses _that you choose_ to sa-honeypot. 
> 
> Thus your scenario is only a problem if you choose to forward an address
> that's close to an actual address.
> 
>

Oh, yeah, I did read past "targeted".  Sorry :-}

Re: Do something useful with bad addresses?

Posted by Derek Harding <de...@innovyx.com>.
On Mon, 2006-11-06 at 14:34 -0800, John Rudd wrote:
> > 
> > Alternatively, create a single sa-honeypot user and redirect all your
> > targeted bad addresses to that account, and then run sa-learn --spam
> > against that one mailbox from cron.daily.
> > 
> 
> And then an important email comes into the organization, with a single 
> pair of characters transposed, and thus not only disappears, but gets 
> learned as spam.  Then everyone in IT draws straws to see who gets to 
> fall on their sword.

Did you miss the word "targeted"? I took the original poster to mean
forward all the bad addresses _that you choose_ to sa-honeypot. 

Thus your scenario is only a problem if you choose to forward an address
that's close to an actual address.

Derek


Re: Do something useful with bad addresses?

Posted by John Rudd <jr...@ucsc.edu>.
John D. Hardin wrote:
> On Mon, 6 Nov 2006, Evan Platt wrote:
> 
>> Off the bat, I wouldn't want to do it for every bad e-mail address, 
>> just a bunch of known bad ones, addresses that somehow got added to 
>> mailing lists, or the such.
>>
>> Any suggestions?
> 
> Create the user accounts (which is the simplest way to get it to start
> accepting messages) and run sa-learn --spam against their mailboxes
> from cron.daily.
> 
> Alternatively, create a single sa-honeypot user and redirect all your
> targeted bad addresses to that account, and then run sa-learn --spam
> against that one mailbox from cron.daily.
> 

And then an important email comes into the organization, with a single 
pair of characters transposed, and thus not only disappears, but gets 
learned as spam.  Then everyone in IT draws straws to see who gets to 
fall on their sword.


Re: Do something useful with bad addresses?

Posted by "John D. Hardin" <jh...@impsec.org>.
On Mon, 6 Nov 2006, Evan Platt wrote:

> Off the bat, I wouldn't want to do it for every bad e-mail address, 
> just a bunch of known bad ones, addresses that somehow got added to 
> mailing lists, or the such.
> 
> Any suggestions?

Create the user accounts (which is the simplest way to get it to start
accepting messages) and run sa-learn --spam against their mailboxes
from cron.daily.

Alternatively, create a single sa-honeypot user and redirect all your
targeted bad addresses to that account, and then run sa-learn --spam
against that one mailbox from cron.daily.

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  The difference between ignorance and stupidity is that the stupid
  desire to remain ignorant.                             -- Jim Bacon
-----------------------------------------------------------------------
 Tomorrow: the campaign ads stop
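
A minimal sketch of what the single-honeypot cron.daily job suggested above
might look like. The mailbox path, archive directory and script layout are
assumptions for illustration, not anything stated in the thread; only
"sa-learn --spam --mbox" comes from the discussion. The learned mailbox is
moved aside rather than deleted, so the corpus stays available for review.

```shell
#!/bin/sh
# Hypothetical /etc/cron.daily/sa-honeypot sketch.
# All three defaults below are assumptions; override via the environment.
MBOX="${MBOX:-/var/mail/sa-honeypot}"
ARCHIVE_DIR="${ARCHIVE_DIR:-/var/spool/sa-honeypot}"
LEARN="${LEARN:-sa-learn}"

learn_honeypot() {
    # Nothing to do if the mailbox is missing or empty.
    [ -s "$MBOX" ] || return 0

    # Learn first; if learning fails, leave the mailbox untouched.
    "$LEARN" --spam --mbox "$MBOX" || return 1

    # Archive the learned mailbox under a timestamped name.
    mkdir -p "$ARCHIVE_DIR" || return 1
    mv "$MBOX" "$ARCHIVE_DIR/spam-$(date +%Y%m%d-%H%M%S).mbox"
}

learn_honeypot
```

One caveat with this sketch: a message delivered between the learn step and
the mv would be archived without having been learned, so a real script may
want to lock or rotate the mailbox before learning from it.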