Posted to users@spamassassin.apache.org by joea <jo...@j4computers.com> on 2012/04/13 00:23:30 UTC
auto add spam/ham for manual learning
I see that one can forward mail to a location and have spamassassin scan it, for both spam and ham, I gather.
I'm wondering what settings to change to have it ignore any additional info the forward adds. Yes, I am looking for a bit of spoon feeding here, as, well, it's been a long day.
Re: auto add spam/ham for manual learning
Posted by John Hardin <jh...@impsec.org>.
On Fri, 13 Apr 2012, Kris Deugau wrote:
> John Hardin wrote:
>> The best you can do if you're doing forwarding for training, is to
>> require that the original ham/spam be forwarded as an RFC822 attachment,
>
> Outlook is not consistently capable of doing this correctly - worse, the
> behaviour changes depending on whether you're forwarding one message or
> more than one. O_o
YGBFKM. But then, it's Outhouse, so I shouldn't be surprised.
> Either way, I wouldn't recommend auto-training on user-submitted mail unless
> you have a small userbase you can talk to individually, in person.
...yeah, there's that, too. "Your users are out to destroy your network"
is a healthy admin attitude, not paranoia. :)
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
A well educated Electorate, being necessary to the liberty of a
free State, the Right of the People to Keep and Read Books,
shall not be infringed.
-----------------------------------------------------------------------
Today: Thomas Jefferson's 269th Birthday
Re: auto add spam/ham for manual learning
Posted by Kris Deugau <kd...@vianet.ca>.
John Hardin wrote:
> The best you can do if you're doing forwarding for training, is to
> require that the original ham/spam be forwarded as an RFC822 attachment,
Some users with sane mail clients *can* be trained to do this - you just
have to find the right instructions.
If they're all using Outlook, set up shared folders instead. Outlook is
not consistently capable of forwarding a message as an attachment
correctly - worse, the behaviour changes depending on whether you're
forwarding one message or more than one. O_o
> and then have a mailbox preprocessing step that extracts the attachments
> and saves them in the "real" learning mail folders. This doesn't
> guarantee no lost or altered data, but it will minimize it. It's also
> possibly more technically advanced than your users will be comfortable
> with.
Another handy way to make this happen is to install a webmail suite that
supports a "Report spam" option. This lets users Do The Right Thing
without having to do it by hand.
> If your mail server supports it, a better way is to define a couple of
> public/shared mail folders (e.g. public IMAP folders) and have people
> move missed spams and copy misclassified hams from their private folders
> to those public folders. This will avoid the changes made to the message
> from being forwarded. Then train from those folders.
Either way, I wouldn't recommend auto-training on user-submitted mail
unless you have a small userbase you can talk to individually, in
person. I have a handful of customers who don't seem to read the
replies I occasionally send when they report their entirely legitimate
inbox as spam - I've regularly seen those same replies reported as
spam. :(
-kgd
Re: auto add spam/ham for manual learning
Posted by John Hardin <jh...@impsec.org>.
On Thu, 12 Apr 2012, joea wrote:
> I see that one can forward mail to a location and have spamassassin scan it, for both spam and ham, I gather.
>
> I'm wondering what settings to change to have it ignore any additional info the forward adds. Yes, I am looking for a bit of spoon feeding here, as, well, it's been a long day.
Forwards are unavoidably mangled and/or abridged as part of the forwarding
process. There are no settings to undo that.
The best you can do if you're doing forwarding for training, is to require
that the original ham/spam be forwarded as an RFC822 attachment, and then
have a mailbox preprocessing step that extracts the attachments and saves
them in the "real" learning mail folders. This doesn't guarantee no lost
or altered data, but it will minimize it. It's also possibly more
technically advanced than your users will be comfortable with.
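
As a rough illustration of that preprocessing step, here is a minimal
Python sketch using only the standard library "email" package. The
function name extract_forwarded_originals is made up for this example,
and re-serializing a parsed message is not guaranteed to be
byte-for-byte identical to what was attached, so treat it as a starting
point rather than a finished tool.

```python
from email import message_from_bytes, policy


def _collect(part, found):
    """Gather message/rfc822 attachments below `part` into `found`."""
    if part.get_content_type() == "message/rfc822":
        # The payload of a message/rfc822 part is a list holding one
        # parsed message: the mail the user attached.
        payload = part.get_payload()
        if isinstance(payload, list) and payload:
            found.append(payload[0].as_bytes())
        return  # do not descend into the attached original itself
    if part.is_multipart():
        for sub in part.get_payload():
            _collect(sub, found)


def extract_forwarded_originals(raw_bytes):
    """Return the serialized bytes of each attached original message.

    Hypothetical helper: `raw_bytes` is the complete forwarded mail as
    read from the submission mailbox.
    """
    wrapper = message_from_bytes(raw_bytes, policy=policy.default)
    found = []
    _collect(wrapper, found)
    return found
```

Each returned byte string could then be written out as its own file in
the spam or ham learning folder. A forward that carries no
message/rfc822 part yields an empty list, which is a reasonable signal
to skip (or bounce) the submission.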
If your mail server supports it, a better way is to define a couple of
public/shared mail folders (e.g. public IMAP folders) and have people move
missed spams and copy misclassified hams from their private folders to
those public folders. This will avoid the changes made to the message from
being forwarded. Then train from those folders.
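
Training from those folders could be driven by a small wrapper around
sa-learn, roughly like the sketch below. The helper names and any paths
you pass in are hypothetical, and it assumes each folder is a directory
holding one message per file (for example a Maildir "cur" directory)
that sa-learn can read directly.

```python
import subprocess


def sa_learn_command(kind, folder):
    """Build the argv that trains sa-learn on one folder of messages."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, folder]


def train_folders(folders, run=subprocess.run):
    """Run sa-learn once per (kind, folder) pair.

    `run` is injectable so the command list can be inspected without
    actually invoking sa-learn.
    """
    for kind, folder in folders:
        run(sa_learn_command(kind, folder), check=True)
```

Run it as the same user that owns the Bayes database SpamAssassin
scans with, otherwise the training lands in a different database.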
If you do that, you probably don't want to leave messages sitting in the
shared ham folder; move them to a private-to-admin folder.
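
If the folders happen to be stored as Maildirs on disk (an assumption;
it depends on your mail server), emptying the shared ham folder into an
admin-only one could look something like this. The function name and
the paths are placeholders for illustration.

```python
import mailbox


def move_all(src_path, dst_path):
    """Move every message from one Maildir to another; return the count.

    Messages are copied as raw bytes so the content is not re-serialized
    on the way. Maildir flags (seen, replied, ...) are not carried over.
    """
    src = mailbox.Maildir(src_path, create=False)
    dst = mailbox.Maildir(dst_path, create=True)
    moved = 0
    for key in list(src.keys()):
        dst.add(src.get_bytes(key))
        src.remove(key)
        moved += 1
    return moved
```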
Either way, I don't recommend deleting any messages that were submitted
for training - in other words, keep your training corpora. Having your
training corpora allows you to troubleshoot and correct accidental (or
malicious) mistraining, and allows you to wipe and retrain from scratch if
you need to (e.g. if you lose your entire Bayes database for some reason).
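
A wipe-and-retrain from the kept corpora might be sequenced roughly as
below. This is only a sketch: the function name and corpus directories
are hypothetical, and it simply builds the sa-learn command lines
(--clear, then learning with --no-sync, then one final --sync) without
running them.

```python
def rebuild_commands(spam_dirs, ham_dirs):
    """Return the sa-learn invocations for a from-scratch retrain."""
    commands = [["sa-learn", "--clear"]]  # wipe the existing Bayes database
    for folder in spam_dirs:
        commands.append(["sa-learn", "--no-sync", "--spam", folder])
    for folder in ham_dirs:
        commands.append(["sa-learn", "--no-sync", "--ham", folder])
    # Sync the journal into the database once, after all learning is done.
    commands.append(["sa-learn", "--sync"])
    return commands
```

Since --clear is destructive, it seems prudent to print and review the
list before handing it to anything that executes it.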
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Gun Control laws cannot reduce violent crime, because gun control
laws focus obsessively on a tool a criminal might use to commit a
crime rather than the criminal himself and his act of violence.
-----------------------------------------------------------------------
Tomorrow: Thomas Jefferson's 269th Birthday