Posted to users@spamassassin.apache.org by William Stearns <ws...@pobox.com> on 2004/02/23 04:07:25 UTC

using sa-learn with a remote spamd, was Re: how to pipe an mbox to sa-learn (fwd)

Good evening, Tom,
	Here's a post I did a while back on remote sa-learn learning.
	Cheers,
	- Bill

---------------------------------------------------------------------------
        "You know you're drinking too much coffee when the only time
you're standing still is in an earthquake."
(Courtesy of Rich Pinkall Pollei <wh...@worldnet.att.net>)
--------------------------------------------------------------------------
William Stearns (wstearns@pobox.com).  Mason, Buildkernel, freedups, p0f,
rsync-backup, ssh-keyinstall, dns-check, more at:   http://www.stearns.org
--------------------------------------------------------------------------

---------- Forwarded message ----------
Date: Thu, 5 Feb 2004 11:33:18 -0500 (EST)
From: William Stearns <ws...@pobox.com>
To: Andy Spiegl <sp...@spiegl.de>
Cc: ML-spamassassin-talk <sp...@incubator.apache.org>,
     William Stearns <ws...@pobox.com>
Subject: Re: how to pipe an mbox to sa-learn

Good day, Andy,

On Thu, 5 Feb 2004, Andy Spiegl wrote:

> Usually I call
>  sa-learn --spam --mbox spammybox
> but to do the same on remote machines I'd like to do
>  cat spammybox | ssh server "sa-learn --spam --mbox"
> 
> But sa-learn only spits perl-errors in that case.
> 
> This doesn't:
>  cat spammybox | ssh server "sa-learn --spam"
> but sa-learn thinks the whole mbox is just ONE message.
> 
> Is that a bug or a feature or am I missing something?  :-)

	I wasn't able to get it to work either.  To enable remote 
reporting, I put together the following script.  It's also available at 
http://www.stearns.org/sa-blacklist/learn-spam.current .  You'll need to 
customize $HOME, SpamFolders, HamFolders, and ReportServers.  The contents 
of SpamFolders and HamFolders are assumed to be emails you've personally 
verified to be spams/hams.  Once reported, these folders will be renamed 
and compressed to save space, but still provide access if you need them.
	I happen to use ssh-agent to provide instant access to the remote
machines without requiring a password (see
http://www.stearns.org/doc/ssh-techniques.current.html and
http://www.stearns.org/doc/ssh-techniques-two.current.html ; once
ssh-agent is running, I type "set | grep '^SSH' >~/agent").  If run from
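the command line, the ssh access will work with a normal password or
whatever you use, provided you remove "-o BatchMode=yes" from the ssh
commands (that option disables password prompts).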
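
	As a rough sketch of that agent step, the capture could be wrapped in
a small function.  The name save_agent_env is made up for illustration, and
the sketch assumes a Bourne-style shell whose "set" output can be sourced
back in unchanged:

```shell
#!/bin/bash
# Hypothetical helper, not part of the learn-spam script: save the SSH
# agent variables of the current shell so a later job can source them.
save_agent_env() {
	# $1: file to write; the learn-spam script reads $HOME/agent.
	# The subshell keeps the restrictive umask from leaking out, so a
	# newly created file is readable only by its owner.
	( umask 077; set | grep '^SSH_' >"$1" )
}

# Example (run once after ssh-agent and ssh-add are set up):
#   save_agent_env "$HOME/agent"
```

Sourcing the resulting file (". $HOME/agent") restores SSH_AUTH_SOCK and
SSH_AGENT_PID in a shell that was not started under the agent, such as a
cron job.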




#!/bin/bash
#Copyright 2003 William Stearns <ws...@pobox.com>
#GPL'd.
#Is razor-report enough, or do we need to do some equivalent of spamassassin -r -d -a?

if [ -z "$HOME" ]; then
	HOME="/home/wstearns/"
fi

#Bail out quietly if another copy is already running.
LOCKFILE="$HOME/learnspam.lock"
[ -f "$LOCKFILE" ] && exit 0
trap 'rm -f "$LOCKFILE"' EXIT
touch "$LOCKFILE"
#Run at low priority.
renice +15 -p $$ >/dev/null 2>&1


#User settings:
#Wildcards OK (they expand relative to the current directory, so run this
#from $MailDir if you use them), relative dirs OK, absolute dirs aren't.
SpamFolders="verified-spam"

#Same rules as SpamFolders.
HamFolders="verified-ham"

#The following are the machines (and optional usernames) to which we'll
#ssh to learn these spams into their respective bayesian databases. 
#The user we ssh under needs to have ssh set up, and needs write
#privileges to the (we assume shared) bayesian and whitelist databases.
ReportServers="localhost spamtrap@somemachine spam@somebox.domain.org"

MailDir="$HOME/mail/"
ArchiveDir="$HOME/mail/archives/"
#End of user settings


if [ -f "$HOME/agent" ]; then
	. "$HOME/agent"
	export SSH_AUTH_SOCK SSH_AGENT_PID SSH_ASKPASS
else
	echo "SSH agent info not in $HOME/agent, please place it there."
fi
export LC_ALL=C


for OneFolder in $SpamFolders ; do
	if [ -f "$MailDir/$OneFolder" ]; then
		echo "Reporting $MailDir/$OneFolder to the razor database."
		nice razor-report "$MailDir/$OneFolder"
	fi
done

for Server in $ReportServers ; do
	for OneFolder in $SpamFolders ; do
		if [ -f "$MailDir/$OneFolder" ]; then
			echo "========== $Server: SSSS $OneFolder"
			#Local equivalent: sa-learn --no-rebuild --showdots --mbox --spam "$MailDir/$OneFolder"
			#Piping the mbox straight into sa-learn --mbox didn't work, so stage it in a
			#remote temp file and remove that file whether or not learning succeeds.
			ssh -o BatchMode=yes -o Compression=yes "$Server" \
			 'TF=`mktemp -q /tmp/spam.XXXXXX </dev/null` && cat >>"$TF" && nice sa-learn --no-rebuild --showdots --mbox --spam "$TF" 2>&1 && echo Successful.; [ -n "$TF" ] && rm -f "$TF"' <"$MailDir/$OneFolder" 2>/dev/null
		fi
	done

	for OneFolder in $HamFolders ; do
		if [ -f "$MailDir/$OneFolder" ]; then
			echo "========== $Server: HHHH $OneFolder"
			#Local equivalent: sa-learn --no-rebuild --showdots --mbox --ham "$MailDir/$OneFolder"
			#Piping the mbox straight into sa-learn --mbox didn't work, so stage it in a
			#remote temp file and remove that file whether or not learning succeeds.
			ssh -o BatchMode=yes -o Compression=yes "$Server" \
			 'TF=`mktemp -q /tmp/ham.XXXXXX </dev/null` && cat >>"$TF" && nice sa-learn --no-rebuild --showdots --mbox --ham "$TF" 2>&1 && echo Successful.; [ -n "$TF" ] && rm -f "$TF"' <"$MailDir/$OneFolder" 2>/dev/null
		fi
	done

	echo "========== $Server: rebuild"
	#sa-learn --rebuild
	ssh -o BatchMode=yes $Server 'sa-learn --rebuild 2>&1' 2>/dev/null
done

#Archive and compress the folders that were just reported.
DateStamp=`date +%Y%m%d%H%M`
mkdir -p "$ArchiveDir"
for OneFolder in $SpamFolders $HamFolders ; do
	if [ -f "$MailDir/$OneFolder" ]; then
		echo "Saving to $OneFolder.$DateStamp"
		#Only compress if the move worked.
		mv "$MailDir/$OneFolder" "$ArchiveDir/$OneFolder.$DateStamp" && \
		 nice bzip2 -9 "$ArchiveDir/$OneFolder.$DateStamp"
	fi
done
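

	The remote half of the script (stage stdin in a temp file, run
sa-learn on that file, clean up) can also be sketched as a standalone
function.  The name stage_and_run and the /tmp/learn.XXXXXX template are
hypothetical, chosen only for this illustration; unlike a plain && chain,
this sketch removes the temp file even when the command fails:

```shell
#!/bin/bash
# Hypothetical helper: copy stdin to a temp file, run the given command
# with that file as its last argument, and always remove the file.
stage_and_run() {
	local tf status
	tf=$(mktemp /tmp/learn.XXXXXX) || return 1
	# Only run the command if the whole input was written out.
	cat >"$tf" && "$@" "$tf"
	status=$?
	rm -f "$tf"
	# Hand the command's exit status back to the caller.
	return "$status"
}

# Possible remote use, assuming a bash-compatible login shell on the server:
#   ssh server "$(declare -f stage_and_run); stage_and_run nice sa-learn --spam --mbox" <spammybox
```

Passing the learner command as arguments keeps the helper the same for
spam and ham; only the sa-learn flags change between the two calls.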