Posted to dev@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2004/07/21 21:07:34 UTC

Re: anotherone: sa-learn against imap folders eg cyrus directory

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Johannes russek writes:
> hi ye guys,
> i'm sorry, i found DMZS-sa-learn right now :)
> hm, but it calls sa-learn and therefore runs two instances of perl.
> isn't that senseless?
> what about making Mail::SpamAssassin::CmdLearn available as a perl object?
> (that requires only a small change in the class, so that the options will be
> taken from a new constructor instead of Getopt).
> should i do that?
> that way i could patch DMZS-sa-learn that it calls a
> Mail::SpamAssassin::CmdLearn->new() construct instead of running sa-learn.

That certainly makes sense ;)

- --j.

> regards, johannes
> 
> > -----Original Message-----
> > From: Johannes russek [mailto:johannes.russek@io-consulting.net]
> > Sent: Wednesday, July 21, 2004 7:38 PM
> > To: spamassassin-dev@incubator.apache.org
> > Subject: sa-learn against imap folders eg cyrus directory
> >
> >
> > hi ye guys.
> > i've got a cyrus 2.1 server up and running. my users all have an
> > INBOX.spam directory where they can place spam and where sieve puts
> > the spam that spamassassin detected.
> > now of course i want to train spamassassin with those folders.
> > are there any scripts already written for doing so?
> > i found something for running against maildir, but cyrus doesn't
> > use maildir :)
> > regards, johannes
> >
> >
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFA/r72QTcbUG5Y7woRAtGrAKCEf/elWYz2yYfZ2odptpIvzdykpgCfXK9e
H70y3jWtSkoby6yJ4XkgH3o=
=rIbg
-----END PGP SIGNATURE-----


RE: anotherone: sa-learn against imap folders eg cyrus directory

Posted by Johannes russek <jo...@io-consulting.net>.
> -----Original Message-----
> From: Malte S. Stretz [mailto:msquadrat.nospamplease@gmx.net]
> Sent: Thursday, July 22, 2004 12:25 AM
> To: spamassassin-dev@incubator.apache.org
> Subject: Re: anotherone: sa-learn against imap folders eg cyrus
> directory
>
>
> On Wednesday 21 July 2004 21:11 CET Michael Parker wrote:
> > On Wed, Jul 21, 2004 at 12:07:34PM -0700, Justin Mason wrote:
> > > Johannes russek writes:
> > > > hm, but it calls sa-learn and therefore runs two instances of perl.
> > > > isn't that senseless?
> > > > what about making Mail::SpamAssassin::CmdLearn available as a perl
> > > > object?
>
> CmdLearn.pm doesn't exist anymore in 3.0, it all went into the sa-learn
> executable (where it belongs).
>
> > > >[...]
> > > > that way i could patch DMZS-sa-learn that it calls a
> > > > Mail::SpamAssassin::CmdLearn->new() construct instead of running
> > > > sa-learn.
> > >[...]
> >
> > Or just make calls directly into the API, that is what it is there for
> > after all.
>
> If you think that's too complicated, you can also make Perl load the
> sa-learn executable in-process. That's done via the "do" command. Look at
> spamc/configure.pl (the block which starts with a comment "We now call the
> preprocessor in its own namespace") to see how you can feed parameters to
> sa-learn.
>
> As always, there are many+x ways to do this in Perl ;-)
>
> Cheers,
> Malte

great!
after all, i seem not to have looked deep enough into the spamassassin
source :)
regards, johannes


>
> --
> [SGT] Simon G. Tatham: "How to Report Bugs Effectively"
>       <http://www.chiark.greenend.org.uk/~sgtatham/bugs.html>
> [ESR] Eric S. Raymond: "How To Ask Questions The Smart Way"
>       <http://www.catb.org/~esr/faqs/smart-questions.html>
>


Re: anotherone: sa-learn against imap folders eg cyrus directory

Posted by "Malte S. Stretz" <ms...@gmx.net>.
On Wednesday 21 July 2004 21:11 CET Michael Parker wrote:
> On Wed, Jul 21, 2004 at 12:07:34PM -0700, Justin Mason wrote:
> > Johannes russek writes:
> > > hm, but it calls sa-learn and therefore runs two instances of perl.
> > > isn't that senseless?
> > > what about making Mail::SpamAssassin::CmdLearn available as a perl
> > > object?

CmdLearn.pm doesn't exist anymore in 3.0, it all went into the sa-learn 
executable (where it belongs).

> > >[...]
> > > that way i could patch DMZS-sa-learn that it calls a
> > > Mail::SpamAssassin::CmdLearn->new() construct instead of running
> > > sa-learn.
> >[...]
>
> Or just make calls directly into the API, that is what it is there for
> after all.

If you think that's too complicated, you can also make Perl load the 
sa-learn executable in-process. That's done via the "do" command. Look at 
spamc/configure.pl (the block which starts with a comment "We now call the 
preprocessor in its own namespace") to see how you can feed parameters to 
sa-learn.
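
A minimal sketch of the "do" approach described above. It is not taken from
spamc/configure.pl: the helper name run_in_process and the stand-in script
are hypothetical, and the arguments are only examples. The idea is to
localize @ARGV so the loaded file sees its own command line. Note that a
script which calls exit will end the calling process as well, so this only
fits scripts that return normally.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: run a Perl script inside the current interpreter,
# handing it @args as if they were its command line.
sub run_in_process {
    my ($script, @args) = @_;
    local @ARGV = @args;        # the loaded script reads these as its arguments
    local $@;
    my $result = do $script;    # compile and run the file in-process
    die "could not compile $script: $@" if $@;
    die "could not read $script: $!" unless defined $result;
    return $result;
}

# Stand-in for the real executable so the sketch is runnable on its own:
# it simply returns the arguments it was given, joined by commas.
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} 'join(",", @ARGV);';
close $fh;

my $seen = run_in_process($path, '--spam', '--mbox', '/tmp/spam.mbox');
print "script saw: $seen\n";
```

Because @ARGV is localized, the caller's own arguments are restored as soon
as run_in_process returns.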

As always, there are many+x ways to do this in Perl ;-)

Cheers,
Malte

-- 
[SGT] Simon G. Tatham: "How to Report Bugs Effectively"
      <http://www.chiark.greenend.org.uk/~sgtatham/bugs.html>
[ESR] Eric S. Raymond: "How To Ask Questions The Smart Way"
      <http://www.catb.org/~esr/faqs/smart-questions.html>

Re: anotherone: sa-learn against imap folders eg cyrus directory

Posted by Michael Parker <pa...@pobox.com>.
On Wed, Jul 21, 2004 at 12:07:34PM -0700, Justin Mason wrote:
> Johannes russek writes:
> > hi ye guys,
> > i'm sorry, i found DMZS-sa-learn right now :)
> > hm, but it calls sa-learn and therefore runs two instances of perl.
> > isn't that senseless?
> > what about making Mail::SpamAssassin::CmdLearn available as a perl object?
> > (that requires only a small change in the class, so that the options will be
> > taken from a new constructor instead of Getopt).
> > should i do that?
> > that way i could patch DMZS-sa-learn that it calls a
> > Mail::SpamAssassin::CmdLearn->new() construct instead of running sa-learn.
> 
> That certainly makes sense ;)
> 

Or just make calls directly into the API, that is what it is there for
after all.

Something like this....

Michael

#!/usr/bin/perl -w

# Copyright 2004 Michael Parker
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

##
#
# Script: sa-learn-IMAP-spam.pl
# Description:
#  This script provides learning capabilities for IMAP mailboxes.  Specifically
#  this script learns spam from a given spam mailbox.  To use, update script
#  to point to common spam, processed spam and error IMAP mailboxes.  Then
#  invoke with the proper imapserver, username and password options.  You
#  can turn on debug to test.
#
# NOTE: This code requires the SA 3.0 API, the change to < 3.0 API is trivial,
#       but left as an exercise to the reader.
#
##

use strict;

use Getopt::Long;

use Mail::IMAPClient;
use Mail::SpamAssassin 3.0000;

my $debug = 0;

my %opt;

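GetOptions("imapserver=s" => \$opt{imapserver},
           "username=s" => \$opt{username},
           "password=s" => \$opt{password},
           )
    or die "Usage: $0 --imapserver=HOST --username=USER --password=PASS\n";
# All three options are required to log in.
defined $opt{$_} or die "Missing required option --$_\n"
    for qw(imapserver username password);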


# Change these to something sane for your setup
my $spaminfolder = "INBOX.process.spam";
my $spamrptfolder = "INBOX.process.spam-reported";
my $spamerrfolder = "INBOX.process.spam-error";

my $imap = Mail::IMAPClient->new(
                                 Server   => $opt{imapserver},
                                 User     => $opt{username},
                                 Password => $opt{password},
                                 Port     => "143",
                                 Peek     => "1",
                                 Debug    => $debug > 1,
                                 Uid      => '0',
                                 Clear    => '5',
                                 )
    or die ("Could not connect to server: $@\n");  # Mail::IMAPClient reports failures in $@

my $spamass = Mail::SpamAssassin->new( { 'debug' => $debug } );
$spamass->init(1);

my $message_count = $imap->message_count($spaminfolder) || 0;

$imap->select($spaminfolder)
    or die ("Could not select $spaminfolder: " . $imap->LastError . "\n");

my @msgs = $imap->search("ALL");

my $learncount = 0;
my $errcount = 0;

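foreach my $m (@msgs) {
    my $raw_message = $imap->message_string($m);
    $raw_message =~ s/\r\n/\n/g;    # normalize IMAP's CRLF line endings to LF
    my $mail = $spamass->parse($raw_message);

    # Third argument 1 means "learn as spam".
    my $status = $spamass->learn($mail, undef, 1);

    if ($status->did_learn()) {
        # Learned successfully: file under the processed-spam folder.
        $imap->move($spamrptfolder,$m);
        $learncount++;
    }
    else {
        # Not learned (possibly already learned): set aside for review.
        $imap->move($spamerrfolder,$m);
        $errcount++;
    }

    # Release the per-message objects.
    $status->finish();
    $mail->finish();
}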

$imap->expunge;

print "Processed $message_count Messages\n";
print "Learned: $learncount\tError: $errcount\n";