Posted to users@spamassassin.apache.org by sr...@abit.de on 2005/11/23 12:04:38 UTC

Using sa-learn with Notes/Domino Servers via agents

Hi list,

I have the following setup:

Two Exim servers in the DMZ act as incoming and outgoing relays and use
SA to tag messages. They deliver mail to two Domino servers in the DMZ,
which then pass it to the central Domino server for further routing.

I recently had to delete the Bayes DB because autolearn had made a lot
of wrong Bayes entries, which led to many spam messages getting more and
more negative Bayes scores and no longer being tagged. Now I want to
disable autolearn and feed the Bayes DB manually. Sadly, that's not easy
in my current setup, as users cannot reach the SA servers in any way.

What I want is to take the collected spam, and especially the wrongly
tagged ham, from a Notes DB and feed it into sa-learn, most likely by
using an agent that automatically sorts mail in the DB and sends it to a
special email account on the Exim relays, which then pipes the mail to
sa-learn.

Before I start reinventing the wheel: has anyone done something like
that before? What I basically need is a Notes agent that is capable of
mailing DB entries (i.e. mails) in the correct format to another email
account for piping them into sa-learn.

If nothing is known about that particular problem, I'd take any hints
on how to get it to work, as in: what's the best way to set up the mail
sent to sa-learn? I read in the docs that you can attach the spam/ham
mail to the mail you send to an sa-learn pipe, but sadly they don't
mention what such an attached mail should look like.
Should the attachment have a special name?
Does the mail need special markup so that sa-learn knows it has to look
into the attachment for the actual spam/ham?
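
A minimal sketch of one way to handle the attachment case, assuming the
reports arrive as ordinary MIME mails that carry the original message as
a message/rfc822 part, and assuming sa-learn should be handed the
original message rather than the wrapper. The function names and the
"kind" parameter are hypothetical; only the sa-learn --spam and --ham
switches (reading the message from standard input) are taken from
SpamAssassin itself.

```python
import email
import subprocess


def extract_attached_messages(wrapper_bytes):
    """Return the raw bytes of each top-level message/rfc822 attachment."""
    wrapper = email.message_from_bytes(wrapper_bytes)
    if not wrapper.is_multipart():
        return []
    originals = []
    for part in wrapper.get_payload():
        if part.get_content_type() == "message/rfc822":
            # A message/rfc822 part holds exactly one embedded message.
            inner = part.get_payload(0)
            originals.append(inner.as_bytes())
    return originals


def learn(raw_message, kind):
    """Pipe one message into sa-learn; kind is 'spam' or 'ham'."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    subprocess.run(["sa-learn", "--" + kind], input=raw_message, check=True)
```

A pipe script on the relay could read the wrapper from standard input,
call extract_attached_messages() on it and pass each result to learn().
The attachment name plays no role here; only the MIME type is checked.
Note that the message is re-serialised by the email package, so it is
not guaranteed to be byte-for-byte identical to what the Notes agent
attached.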


Another issue I have is that we have two load-balanced Exim servers for
tagging spam, yet I would like to keep the Bayes DB the same on both
hosts. Has anyone come up with a solution to this problem?

Any help would be appreciated,

regards
        sash

Re: Using sa-learn with Notes/Domino Servers via agents

Posted by Paolo Cravero as2594 <pc...@as2594.net>.
Not a solution, but a few thoughts, since we have LN (Lotus Notes) here
as well.

Domino servers add a hell of a lot of headers to email messages, which
might confuse the Bayesian engine.

Forwarding internet mail from one LN account to another DESTROYS the
RFC2822 headers. Copying preserves them.

LN clients can access IMAP mailboxes (a sort-of undocumented, hidden
feature). sa-learn can be fed through a call from fetchmail accessing an
IMAP mailbox and folder. (I think the latter is documented in the Wiki.)
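
The IMAP route can be sketched roughly as follows. This is not the
fetchmail setup mentioned above: it swaps in Python's standard imaplib
to do the fetching, and the host, account, folder and function names are
hypothetical. It assumes users copy (not forward) misclassified mail
into a shared IMAP folder so the original headers survive.

```python
import imaplib
import subprocess


def fetch_folder(conn, folder):
    """Yield the raw bytes of every message in an IMAP folder."""
    status, _ = conn.select(folder, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot open folder {folder!r}")
    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("search failed")
    for num in data[0].split():
        status, parts = conn.fetch(num, "(RFC822)")
        if status == "OK":
            # parts[0] is a (response line, raw message bytes) tuple.
            yield parts[0][1]


def learn_folder(host, user, password, folder, kind):
    """Feed every message in one folder to sa-learn as 'spam' or 'ham'."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        for raw in fetch_folder(conn, folder):
            subprocess.run(["sa-learn", "--" + kind], input=raw, check=True)
    finally:
        conn.logout()
```

For example, learn_folder("imap.example.com", "sa-learn", "secret",
"Spam", "spam") would be run from cron on each relay. The folder is
opened read-only, so nothing is deleted or flagged; sa-learn skips
messages it has already learned.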

You may widen the autolearn thresholds so that fewer messages are fed 
automatically to the Bayes DB.
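
As an illustration of widening the thresholds, a local.cf fragment might
look like the one below. The option names are SpamAssassin's own; the
numeric values are assumptions picked only for the example and would
need tuning against real traffic.

```
# local.cf fragment (values are illustrative assumptions)
# Autolearn as ham only below this score (the default is 0.1).
bayes_auto_learn_threshold_nonspam  -1.0
# Autolearn as spam only above this score (the default is 12.0).
bayes_auto_learn_threshold_spam     15.0
# Or switch autolearn off entirely:
# bayes_auto_learn 0
```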

> Another issue I have is that we have 2 loadbalanced exim servers for
> tagging spam, yet I would like to keep the bayes DB the same on both
> hosts. Did anyone ever come up with a solution to this problem?

Yes, an RDBMS backend for the Bayes database (MySQL here). Otherwise you
might elect one server as "master" and align the DBs nightly (spamd
restart!). Or stay with misaligned Bayes DBs: if your servers route a
lot of msgs/day (n*10k) and are round-robin balanced, the DBs will be
statistically identical. Same goes for AWL, if used.
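
For the RDBMS option, a sketch of the settings both relays would share
in local.cf. The database name, host and credentials are placeholders,
the Bayes tables have to be created first from the schema shipped with
SpamAssassin, and the store module name should be checked against the
documentation of the installed version.

```
# local.cf fragment, identical on both relays (placeholder values)
bayes_store_module  Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn       DBI:mysql:sa_bayes:db.example.com
bayes_sql_username  sa_user
bayes_sql_password  changeme
```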

HTH,
Paolo

-- 
|    QRPp-I #707  + www.paolocravero.tk +  I QRP #476   |
| SpamAssassin-based email antispam/antivirus solutions |
  \    Italian/English-to/from-Croatian translations    /
   \                   Skype: pcravero                 /