Posted to users@spamassassin.apache.org by werner detter <wd...@ilum.org> on 2005/01/11 11:12:46 UTC

spamassassin + filter.sh

hey everybody on the list,

i use postfix on my mailserver, and spamassassin for marking spam mails.
it works really great except for
one aspect:

postfix uses the filter.sh which hands the mail to spamassassin.
my only problem is that every mail is being scanned, even
if it is big (e.g. > 2mb). most spam is smaller than
100k, so the actual setup isn't good for the resources on the
server.

*cut from master.cf*
filter    unix  -       n       n       -       -       pipe
  flags=Rq user=filter argv=/usr/local/filter/filter.sh -f ${sender} -- ${recipient}

*end*

the filter.sh looks as follows:

---------------------------------------------------------
cat /usr/local/filter/filter.sh

#!/bin/sh
#
# filter.sh
#
# Simple filter to plug Anomy Sanitizer and SpamAssassin
# into the Postfix MTA
#
# From http://advosys.ca/papers/postfix-filtering.html
# Advosys Consulting Inc., Ottawa
#
# For use with:
#    Postfix 20010228 or later
#    Anomy Sanitizer revision 1.49 or later
#    SpamAssassin 2.42 or later
#
# Note: Modify the file locations to match your particular
#       server and installation of SpamAssassin.

# File locations:
# (CHANGE AS REQUIRED TO MATCH YOUR SERVER)
INSPECT_DIR=/var/spool/filter
SENDMAIL="/usr/sbin/sendmail -i"
ANOMY=/usr/anomy
SANITIZER=/usr/anomy/bin/sanitizer.pl
ANOMY_CONF=/usr/anomy/anomy.conf
ANOMY_LOG=/dev/null
SPAMASSASSIN=/usr/bin/spamassassin

export ANOMY

# Exit codes from <sysexits.h>
EX_TEMPFAIL=75
EX_UNAVAILABLE=69

cd $INSPECT_DIR || { echo $INSPECT_DIR does not exist; exit $EX_TEMPFAIL; }

# Clean up when done or when aborting.
trap "rm -f out.$$" 0 1 2 3 15

#cat | $SPAMASSASSIN -P | $SANITIZER \
#   $ANOMY_CONF 2>>$ANOMY_LOG > out.$$ || \
#   { echo Message content rejected; exit $EX_UNAVAILABLE; }

cat | $SPAMASSASSIN > out.$$ || { echo Message content rejected; exit $EX_UNAVAILABLE; }

$SENDMAIL "$@" < out.$$

exit $?
--------------------------------------------------------------------------------------------------------------------

my idea is to modify the filter.sh so that only mails smaller than 100k
are given
to spamassassin; the rest shouldn't be :/

i modified the script several times but still can't get it working.


my script:
----------------------------------------------------------------------------------------------------------------------
#!/bin/sh


INSPECT_DIR=/var/spool/filter
SENDMAIL="/usr/sbin/sendmail -i"
ANOMY=/usr/anomy
SANITIZER=/usr/anomy/bin/sanitizer.pl
ANOMY_CONF=/usr/anomy/anomy.conf
ANOMY_LOG=/dev/null
SPAMASSASSIN=/usr/bin/spamassassin

export ANOMY

# Exit codes from <sysexits.h>
EX_TEMPFAIL=75
EX_UNAVAILABLE=69
cd $INSPECT_DIR || { echo $INSPECT_DIR does not exist; exit $EX_TEMPFAIL; }


# Clean up when done or when aborting.
SIZE="/usr/bin/du -hsk out.$$|awk '{$1}'"
echo $SIZE
trap "rm -f out.$$" 0 1 2 3 15

if [ "$SIZE" <= "100" ]
then
        cat | $SPAMASSASSIN > out.$$ || { echo Message content rejected; exit $EX_UNAVAILABLE; }
        $SENDMAIL "$@" < out.$$
        exit $?
else
        cat > out.$$ || { echo Message content rejected; exit $EX_UNAVAILABLE; }
        $SENDMAIL "$@" < out.$$
        exit $?
fi
------------------------------------------------------------------------------------------------------------------------------

any ideas on how to implement a size check in the script? i'm really
stuck with it,
so any kind of help is appreciated.

kind regards,
werner detter


Re: spamassassin + filter.sh

Posted by Marco van den Bovenkamp <ma...@linuxgoeroe.dhs.org>.
werner detter wrote:

> hm, but even if i implement spamd/spamc - i still have the problem that
> every mail (even if
> it's bigger than e.g. 4mb) is then passed through spamd/spamc instead of
> spamassassin.
> please correct me if i'm wrong ....

No. Spamc will not pass messages larger than 250K to spamd by default. 
Look at the '-s' spamc option to tweak this.
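
A minimal sketch of that option, assuming spamc is installed as /usr/bin/spamc and that the 100 KB limit from the original question is wanted (the path and the variable names are assumptions, not from the thread). The value given to -s is in bytes:

```shell
#!/bin/sh
# Hypothetical threshold: 100 KB, converted to bytes because spamc -s expects bytes.
MAX_SIZE=$((100 * 1024))

# Assumed install path; messages larger than MAX_SIZE are returned
# unprocessed by spamc instead of being sent to spamd.
SPAMC="/usr/bin/spamc -s $MAX_SIZE"

# Show the command line that filter.sh would use as its scanner.
echo "$SPAMC"
```

In filter.sh this string would take the place of the current SPAMASSASSIN value.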

-- 

		Regards,

			Marco.


Re: spamassassin + filter.sh

Posted by Matt Kettler <mk...@evi-inc.com>.
At 03:38 AM 1/13/2005, werner detter wrote:
> > Yeah, so? Why should this inhibit you from using spamd now?
>
>hm, but even if i implement spamd/spamc - i still have the problem that
>every mail (even if
>it's bigger than e.g. 4mb) is then passed through spamd/spamc instead of
>spamassassin.
>please correct me if i'm wrong ....

You're wrong... spamc automatically skips emails over a specified size, by 
default 250k.

Hence why I keep suggesting spamc/spamd.. spamc does what you want to do 
already....


Re: spamassassin + filter.sh

Posted by werner detter <wd...@ilum.org>.
hi again,


Matt Kettler wrote:

> At 11:53 AM 1/11/2005, werner detter wrote:
>
>> thanks for your help, migration to spamc/spamd wouldn't be the
>> problem -> it's even
>> planned within the next half year. there is only one reason this
>> hasn't been done so far:
>> there is no decision from the company management on whether they want to use
>> only spamc/spamd
>> or amavis-new in combination with spamc/spamd and
>> a virusscan (e.g. clamav-new).
>> so i have to wait for their decision - that's the problem.
>
>
> Yeah, so? Why should this inhibit you from using spamd now?

hm, but even if i implement spamd/spamc - i still have the problem that
every mail (even if
it's bigger than e.g. 4mb) is then passed through spamd/spamc instead of
spamassassin.
please correct me if i'm wrong ....

regards,
werner


Re: spamassassin + filter.sh

Posted by Matt Kettler <mk...@evi-inc.com>.
At 11:53 AM 1/11/2005, werner detter wrote:
>thanks for your help, migration to spamc/spamd wouldn't be the problem ->
>it's even
>planned within the next half year. there is only one reason this hasn't
>been done so far:
>there is no decision from the company management on whether they want to use only
>spamc/spamd
>or amavis-new in combination with spamc/spamd and a
>virusscan (e.g. clamav-new).
>so i have to wait for their decision - that's the problem.

Yeah, so? Why should this inhibit you from using spamd now?

>so my idea to reduce the load/ram usage of the mailserver was to just modify
>the filter.sh so that it only passes mails smaller than 100kb to spamassassin
>(as a fast solution/hack).
>
>why are the modifications in the filter.sh harder than the migration to
>spamc/spamd?

The main reason is that conversion to spamc is completely trivial.... It's
so absurdly trivial that doing nearly anything else is going to be harder:

1) start spamd
2) make sure you have an init script to start spamd on boot (there's plenty 
around for the taking)
3) modify filter.sh so that SPAMASSASSIN now points to spamc, not spamassassin

That's it.. you're done.
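
The three steps above can be sketched roughly as follows. This is a hedged illustration, not the poster's actual script: the spamc path, the temp file location and the function name filter_message are assumptions, and spamd is expected to be running already (for example started with "spamd -d" from an init script).

```shell
#!/bin/sh
# Sketch of filter.sh after the switch; only the scanner variable really changes.
# Both values can be overridden from the environment (assumed defaults).
SENDMAIL="${SENDMAIL:-/usr/sbin/sendmail -i}"
SPAMASSASSIN="${SPAMASSASSIN:-/usr/bin/spamc}"   # was /usr/bin/spamassassin
EX_UNAVAILABLE=69

# Hypothetical helper: reads one message on stdin, runs it through the
# scanner, then hands the result to sendmail with the given arguments.
filter_message() {
    tmp="${TMPDIR:-/tmp}/out.$$"
    $SPAMASSASSIN > "$tmp" || {
        echo Message content rejected
        rm -f "$tmp"
        return $EX_UNAVAILABLE
    }
    $SENDMAIL "$@" < "$tmp"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

In the real script the last lines would be something like: filter_message "$@"; exit $?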

>if [ "$SIZE" <= "100" ]


>but this doesn't work that way.

Exactly, because there's no easy way from a shell script to know how big 
your input stream is until you've already read it.

This is why modifying filter.sh is going to be harder than converting to 
spamd. Converting to spamd is easy. Doing a really strange hack to a shell 
script is difficult. 
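
For completeness, a rough sketch of what such a hack would have to do: spool the message to disk first, measure it, and only then decide. Nobody in the thread proposed or tested this; the function name filter_by_size, the MAX_BYTES variable and the temp file paths are assumptions. The attempt quoted above fails for several separate reasons: SIZE holds a literal string rather than command output, "<=" inside [ ] is parsed as an input redirection, awk '{$1}' prints nothing, and out.$$ does not exist yet when it is measured.

```shell
#!/bin/sh
# Hypothetical threshold in bytes (100 KB).
MAX_BYTES=$((100 * 1024))
SENDMAIL="${SENDMAIL:-/usr/sbin/sendmail -i}"
SPAMASSASSIN="${SPAMASSASSIN:-/usr/bin/spamassassin}"
EX_TEMPFAIL=75
EX_UNAVAILABLE=69

filter_by_size() {
    in="${TMPDIR:-/tmp}/in.$$"
    out="${TMPDIR:-/tmp}/out.$$"

    # The size of a stream is unknown until it has been read, so save it first.
    cat > "$in" || { rm -f "$in"; return $EX_TEMPFAIL; }

    # Arithmetic expansion strips any padding that wc may print.
    size=$(( $(wc -c < "$in") ))

    if [ "$size" -le "$MAX_BYTES" ]; then
        # Small enough: scan it and deliver the scanner's output.
        $SPAMASSASSIN < "$in" > "$out" || {
            echo Message content rejected
            rm -f "$in" "$out"
            return $EX_UNAVAILABLE
        }
        deliver="$out"
    else
        # Too large: deliver the original message unscanned.
        deliver="$in"
    fi

    $SENDMAIL "$@" < "$deliver"
    rc=$?
    rm -f "$in" "$out"
    return $rc
}
```

Even with this in place, every scanned message still starts a fresh spamassassin process, so the resource argument for spamc/spamd is unchanged.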


Re: spamassassin + filter.sh

Posted by werner detter <wd...@ilum.org>.
hi matt,

Matt Kettler wrote:

> At 11:06 AM 1/11/2005, werner detter wrote:
>
>> i know that in the future i will have to use spamd/spamc but at the
>> moment
>> i can't migrate because of several reasons. that's why i have to get
>> 'filter.sh'
>> modified in such a way that only mails smaller than 100 kb are passed through
>> spamassassin.
>
>
> Well, the modification of filter.sh might be harder than the migration 
> to spamd....
>
> What's inhibiting you from migrating to spamd? Perhaps we can help 
> make that easier for you.

thanks for your help, migration to spamc/spamd wouldn't be the problem
-> it's even
planned within the next half year. there is only one reason this hasn't
been done so far:
there is no decision from the company management on whether they want to use only
spamc/spamd
or amavis-new in combination with spamc/spamd and a
virusscan (e.g. clamav-new).
so i have to wait for their decision - that's the problem.

so my idea to reduce the load/ram usage of the mailserver was to just
modify
the filter.sh so that it only passes mails smaller than 100kb to
spamassassin (as a fast solution/hack).

why are the modifications in the filter.sh harder than the migration to
spamc/spamd?
IMHO it's just a shell script, so why isn't it possible to get a size check
integrated, in your opinion?

my idea (i'm not a shell scripting guru :D) was to replace this part of
the script:

*** cut ***
cd $INSPECT_DIR || { echo $INSPECT_DIR does not exist; exit $EX_TEMPFAIL; }

# Clean up when done or when aborting.
trap "rm -f out.$$" 0 1 2 3 15

#cat | $SPAMASSASSIN -P | $SANITIZER \
#   $ANOMY_CONF 2>>$ANOMY_LOG > out.$$ || \
#   { echo Message content rejected; exit $EX_UNAVAILABLE; }

cat | $SPAMASSASSIN > out.$$ || { echo Message content rejected; exit $EX_UNAVAILABLE; }

$SENDMAIL "$@" < out.$$

exit $?
** cut **

with something like

** cut ***

SIZE="/usr/bin/du -hsk out.$$|awk '{$1}'"
trap "rm -f out.$$" 0 1 2 3 15

if [ "$SIZE" <= "100" ]
then
        cat | $SPAMASSASSIN > out.$$ || { echo Message content rejected; exit $EX_UNAVAILABLE; }
        $SENDMAIL "$@" < out.$$
        exit $?
else
        cat > out.$$ || { echo Message content rejected; exit $EX_UNAVAILABLE; }
        $SENDMAIL "$@" < out.$$
        exit $?
fi

*** cut ***

but this doesn't work that way.

thanx for your responses on this hot topic :)

regards,
werner detter


Re: spamassassin + filter.sh

Posted by Matt Kettler <mk...@evi-inc.com>.
At 11:06 AM 1/11/2005, werner detter wrote:
>i know that in the future i will have to use spamd/spamc but at the moment
>i can't migrate because of several reasons. that's why i have to get
>'filter.sh'
>modified in such a way that only mails smaller than 100 kb are passed through
>spamassassin.

Well, the modification of filter.sh might be harder than the migration to 
spamd....

What's inhibiting you from migrating to spamd? Perhaps we can help make 
that easier for you.


Re: spamassassin + filter.sh

Posted by werner detter <wd...@ilum.org>.
hi,

i know that in the future i will have to use spamd/spamc but at the moment
i can't migrate because of several reasons. that's why i have to get
'filter.sh'
modified in such a way that only mails smaller than 100 kb are passed through
spamassassin.

regards,
werner



Matt Kettler wrote:

> At 05:12 AM 1/11/2005, werner detter wrote:
>
>> i use postfix on my mailserver, and spamassassin for marking
>> spam mails. it works really great except for
>> one aspect:
>>
>> postfix uses the filter.sh which hands the mail to spamassassin.
>> my only problem is that every mail is being scanned, even
>> if it is big (e.g. > 2mb). most spam is smaller than
>> 100k, so the actual setup isn't good for the resources on the
>> server.
>
>
> Using SpamAssassin by calling spamassassin isn't good at ALL for 
> resources on the server.
>
> The spamassassin script is the simplest, but by far the most 
> inefficient way to use SA. It's intended for hand-run tests and low 
> volume sites.
>
> In the long run you'll want to shift to starting spamd at system 
> startup and call spamc from filter.sh.
>
> 1) spamc automatically skips scans for really large messages
> 2) spamd will have pre-loaded an image of perl, saving a LOT of 
> resources.
> 3) spamd will have pre-parsed /usr/share/spamassassin and 
> /etc/mail/spamassassin, saving more resources.
> 4) spamd can have its child count limited, preventing you from
> forking an infinite number of copies of SA.
> 5) If you're really high volume, you can set it up so they run on 
> separate machines, thus separating the load of your MTA from the SA 
> scanning back end.
>
> The downsides?
>
> 1) If you edit or add rules in /etc/mail/spamassassin you've got to 
> restart.
> 2) a very few people have had their spamd processes hog memory. This was more
> of a problem in 3.0.0 than it is now in 3.0.2.
> 3) if spamd crashes (rare, but a very few seem to be posting about 
> this lately), you've got to restart it before mail scanning will resume.



Re: spamassassin + filter.sh

Posted by Matt Kettler <mk...@comcast.net>.
At 05:12 AM 1/11/2005, werner detter wrote:
>i use postfix on my mailserver, and spamassassin for marking spam mails. it
>works really great except for
>one aspect:
>
>postfix uses the filter.sh which hands the mail to spamassassin.
>my only problem is that every mail is being scanned, even
>if it is big (e.g. > 2mb). most spam is smaller than
>100k, so the actual setup isn't good for the resources on the
>server.

Using SpamAssassin by calling spamassassin isn't good at ALL for resources 
on the server.

The spamassassin script is the simplest, but by far the most inefficient 
way to use SA. It's intended for hand-run tests and low volume sites.

In the long run you'll want to shift to starting spamd at system startup 
and call spamc from filter.sh.

1) spamc automatically skips scans for really large messages
2) spamd will have pre-loaded an image of perl, saving a LOT of resources.
3) spamd will have pre-parsed /usr/share/spamassassin and 
/etc/mail/spamassassin, saving more resources.
4) spamd can have its child count limited, preventing you from forking an
infinite number of copies of SA.
5) If you're really high volume, you can set it up so they run on separate 
machines, thus separating the load of your MTA from the SA scanning back end.

The downsides?

1) If you edit or add rules in /etc/mail/spamassassin you've got to restart.
2) a very few people have had their spamd processes hog memory. This was more of a
problem in 3.0.0 than it is now in 3.0.2.
3) if spamd crashes (rare, but a very few seem to be posting about this 
lately), you've got to restart it before mail scanning will resume.
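
Points 4 and 5 of the first list (child limit, separate scanning machine) might look roughly like this on the command line. This is only a sketch: the child count of 5 and the host name scanhost.example.com are made-up values, and the commands are only built and printed here, not executed.

```shell
#!/bin/sh
# Assumed value; pick a child count that fits the machine.
MAX_CHILDREN=5

# spamd: -d runs it as a daemon, -m caps the number of child processes.
SPAMD_CMD="spamd -d -m $MAX_CHILDREN"

# spamc: -d names the host running spamd (hypothetical host name).
SPAMC_CMD="spamc -d scanhost.example.com"

echo "$SPAMD_CMD"
echo "$SPAMC_CMD"
```

For the remote case spamd also has to listen on a non-loopback address and accept the MTA host as a client; the spamd man page describes the -i and -A options for that.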