Posted to users@spamassassin.apache.org by Rob Fantini <ro...@fantinibakery.com> on 2005/03/05 18:36:53 UTC

mailx vs pine local mail scan times

Hello,
  - mailx [command line] mail is processed 5x faster than mail sent 
using pine.

  we're using Gentoo.
  software versions:
  mail-filter/spamassassin-ruledujour-20050106
  mail-filter/spamassassin-3.0.2-r1
  mail-client/pine-4.62

  some of local.cf:
   trusted_networks 192.168/16 127/8
   internal_networks 192.168/16 127/8
   whitelist_from  *.fantinibakery.com


  I can post from logs in debug mode, and show other config settings. 
Before sending all that detail I wondered if anyone could suggest a 
solution based upon this little bit of info?


  the mail is sent to an account on the same computer.  here is output 
from logs:

pine
----
Mar  5 12:22:02 fbc3 spamd[11137]: connection from bc3.fantinibakery.com 
[127.0.0.1] at port 38516
Mar  5 12:22:02 fbc3 spamd[11137]: info: setuid to rob succeeded
Mar  5 12:22:02 fbc3 spamd[11137]: processing message 
<Pi...@fbc3.fantinibakery.com> for rob:700.
Mar  5 12:22:07 fbc3 spamd[11137]: clean message (-5.4/4.0) for rob:700 
in 5.8 seconds, 559 bytes.
Mar  5 12:22:07 fbc3 spamd[11137]: result: . -5 - 
ALL_TRUSTED,AWL,BAYES_00 
scantime=5.8,size=559,mid=<Pi...@fbc3.fantinibakery.com>,bayes=0,autolearn=ham


mailx
-----
Mar  5 12:22:47 fbc3 spamd[11138]: connection from 
fbc3.fantinibakery.com [127.0.0.1] at port 38520
Mar  5 12:22:47 fbc3 spamd[11138]: info: setuid to rob succeeded
Mar  5 12:22:47 fbc3 spamd[11138]: processing message 
<20...@fbc3.fantinibakery.com> for rob:700.
Mar  5 12:22:48 fbc3 spamd[11138]: clean message (-4.7/4.0) for rob:700 
in 1.1 seconds, 438 bytes.
Mar  5 12:22:48 fbc3 spamd[11138]: result: . -4 - 
ALL_TRUSTED,AWL,BAYES_00,RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK 
scantime=1.1,size=438,mid=<20...@fbc3.fantinibakery.com>,bayes=4.32590542925881e-06,autolearn=ham

thanks.
Rob

Re: mailx vs pine local mail scan times

Posted by Bob Proulx <bo...@proulx.com>.

Rob Fantini wrote:
>  I wonder if a header could be added to a mail from postfix when this 
> part of /etc/postfix/main.cf sees a mail as from local?
> 
> smtpd_recipient_restrictions =
>      permit_mynetworks,
> 
>  The new header could be checked in procmailrc..

Hmm...  I just thought of this on the fly and so there may be
something I am not thinking about.  But it seems easy enough to do
this using postfix's PREPEND option.  This requires postfix 2.1 or
later.

Use a PREPEND to place a new mail header of your choosing.  If that
header exists then run through spamassassin.  If not then you know it
is local and can bypass spamassassin.

  cat /etc/postfix/ext-access.regexp
  /./  PREPEND  X-External-Message: yes

Then in the postfix main.cf file:

  smtpd_recipient_restrictions = 
          permit_mynetworks,
          reject_unauth_destination,
          reject_invalid_hostname,
          reject_non_fqdn_hostname,
          reject_non_fqdn_sender,
          reject_non_fqdn_recipient,
          reject_unknown_sender_domain,
          reject_unknown_recipient_domain,
          check_helo_access hash:/etc/postfix/helo-access,
          check_recipient_access regexp:/etc/postfix/ext-access.regexp,
          check_sender_access hash:/etc/postfix/client-access,
          ... reject_rbl_client ...your list here...,
          ... warn_if_reject reject_rbl_client ...your list here...

Then modify the procmail rule to call spamassassin whenever this
header is present in the mail.  Since all external mail has this
header all external mail goes through SA.

  :0fw
  * ^X-External-Message: yes
  | spamassassin

This header would not need to be secret.  If someone forged it then
their mail would be checked by spamassassin.  Only the absence of the
header could bypass the check.  External mail can't avoid it because
it is placed there by your external mail relay.
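
The argument above can be checked with a small sketch. This is only an illustration in Python of the decision the procmail recipe makes, not part of the tested configuration; the function name needs_scan is made up for the example.

```python
import email

def needs_scan(raw_message):
    """Mimic the procmail condition '^X-External-Message: yes'.

    Hypothetical helper: returns True when the message carries the
    header that postfix PREPENDs to mail arriving from outside.
    """
    msg = email.message_from_string(raw_message)
    values = msg.get_all("X-External-Message", [])
    # procmail matches case-insensitively and only anchors the start.
    return any(v.strip().lower().startswith("yes") for v in values)
```

A message without the header is treated as local and skipped. A sender who forges the header only causes their own mail to be scanned, which is why the header does not need to be secret.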

The danger would be that someday you modify the postfix rules and this
header gets lost.  At that time a lot of spam would pass through.  But
I am sure your users would let you know about that soon enough.

Once again let me warn that I have not thought the above through in
any great detail.  Your suggestion just made me think of this as a way
to do what you were wanting.  It is not something I care about greatly
because I run local mail through spamassassin and splitting out local
mail is not really something I will be pursuing.  But I did test the
above configuration and it worked for me.

Bob

Re: mailx vs pine local mail scan times

Posted by Rob Fantini <ro...@fantinibakery.com>.

Bob Proulx wrote:

> If you can ensure that mail on your network is
> not forged....

  We use postfix , procmail and spamassassin.

  I wonder if a header could be added to a mail from postfix when this 
part of /etc/postfix/main.cf sees a mail as from local?

smtpd_recipient_restrictions =
      permit_mynetworks,

  The new header could be checked in procmailrc..

Re: mailx vs pine local mail scan times

Posted by Bob Proulx <bo...@proulx.com>.

Rob Fantini wrote:
> Bob Proulx wrote:
> > To improve the accuracy you need to avoid whitelists. 
> Should I avoid whitelists altogether, or just for local network
> checking?

The real problem is forgeries and spoofs.  Anyone can put any from
address they want on a mail message.  Viruses especially do this
routinely.  Any whitelist based only on the From: address will be
fooled by these.  You whitelist your network and those will pass right
through the checks.  If you can ensure that mail on your network is
not forged then whitelists for your network will be fine.  But if not,
then some viruses will undoubtedly forge your address and fool your
whitelists.

On my network I try hard to make sure that spoofed mail addresses from
my own domain cannot enter my domain.  But it is hard.  I really can't
do it.  For example this message to the mailing list leaves my
network, goes to the mailing list, then comes back into my network.
The message contains my From: address.  Any whitelist I would have on
my domain would be fooled if that were spoofed.

Because of this problem I don't like any algorithm that by design
trusts the user.  "Who goes there, friend or foe?"  "Friend!"  "Well,
okay fine, you may pass."  Therefore I don't like simple "From: name"
whitelists.  They have that fundamental flaw.  I always try to avoid
them.

So then you ask what is the alternative?  SpamAssassin uses the
trusted_networks setting to follow the chain of hosts, backtracking
through the Received: headers.  When it finds the point where mail
entered your network it can use that foreign machine's IP address and
perform network checks.  If the mail never left the network it sets
ALL_TRUSTED, which is good for negative points, pushing the message
toward the non-spam classification.

It would be great to have that capability available as a standalone
script outside of the full spamassassin check.  It was a check like
that I was suggesting to really know if the mail came from your
network.  But as far as I know it is not available outside of
spamassassin at this time.  If someone had the inclination they could
write that check in a standalone form.
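
A minimal sketch of what such a standalone check might look like, in Python. This is not SpamAssassin's code and is stricter than what SpamAssassin does: it simply requires every relay IP found in the Received: headers to be inside the trusted ranges. The TRUSTED list is assumed to mirror the trusted_networks line from local.cf, and the function names are made up for the example.

```python
import email
import ipaddress
import re

# Assumption: same ranges as "trusted_networks 192.168/16 127/8".
TRUSTED = [
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("127.0.0.0/8"),
]

# Relay IPs usually appear in square brackets inside Received: headers.
BRACKETED_IP = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

def is_trusted(ip):
    """True if ip parses as an address inside one of the TRUSTED networks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # malformed address: do not trust it
    return any(addr in net for net in TRUSTED)

def originated_locally(raw_message):
    """True if every bracketed relay IP in the Received: headers is trusted."""
    msg = email.message_from_string(raw_message)
    for header in msg.get_all("Received", []):
        for ip in BRACKETED_IP.findall(header):
            if not is_trusted(ip):
                return False
    return True
```

Mail handed over by an outside host gets a Received: header from your own relay recording the outside IP, so it fails the check. A message with no Received: headers at all counts as local here, which is only reasonable if all external mail passes through a relay you control.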

Bob

Re: mailx vs pine local mail scan times

Posted by Rob Fantini <ro...@fantinibakery.com>.

Bob Proulx wrote:
> How are you calling spamassassin?  Are you calling it through
> procmail?
  Yes

> If so then you can use procmail to avoid calling
> spamassassin in those cases.  The easiest thing would be to avoid
> processing through spamassassin if the from address were on your
> network.
> 
>   :0fw
>   * !^From: .*@([^.]+\.)?example\.com
>   | spamassassin

  Thank you, that is just what I was looking for.


> To improve the accuracy you need to avoid whitelists. 
  Should I avoid whitelists altogether, or just for local network
checking?

Re: mailx vs pine local mail scan times

Posted by Bob Proulx <bo...@proulx.com>.

Rob Fantini wrote:
>   Is there a way to disable spamassassin from processing mail sent to 
> our local network from our local network?

How are you calling spamassassin?  Are you calling it through
procmail?  If so then you can use procmail to avoid calling
spamassassin in those cases.  The easiest thing would be to avoid
processing through spamassassin if the from address were on your
network.

  :0fw
  * !^From: .*@([^.]+\.)?example\.com
  | spamassassin

This runs the risk that you will see spam from your forged addresses.
Those are called joe-jobs.  But if that does not bother you too much
then this works.
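
To see which messages the recipe lets past SpamAssassin, the condition can be tried out away from procmail. The sketch below is a Python translation of the pattern (with the literal dot escaped), not procmail itself; example.com is the placeholder domain from the recipe and skips_spamassassin is a name made up for the example.

```python
import re

# Python translation of the recipe's condition. procmail matches
# case-insensitively, and ^ anchors at the start of a header line.
LOCAL_FROM = re.compile(
    r"^From: .*@([^.]+\.)?example\.com",
    re.IGNORECASE | re.MULTILINE,
)

def skips_spamassassin(headers):
    """True if the From: line looks local, so the negated recipe does not fire."""
    return LOCAL_FROM.search(headers) is not None
```

Only the From: line is consulted, so a message that arrived from outside with a forged local From: address skips the scan just like genuine local mail. The pattern is also not anchored at the end, so an address such as x@example.com.evil.test matches as well.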

To improve the accuracy you need to avoid whitelists.  If you had a
mail filter program that checked that the message originated on your
network then you could use that as part of the procmail check instead
of just the From: address.  I don't happen to have one handy to post.
But if someone else did I would be interested in something like that
myself.

Bob

Re: mailx vs pine local mail scan times

Posted by Rob Fantini <ro...@fantinibakery.com>.

> At the very least, try this test with 3 consecutive mails per client to
> get some feel of variance in network lookup times.

   I ran the tests you suggested and sure enough it
averages about the same for pine and mailx mails to be processed.


   Is there a way to disable spamassassin from processing mail sent to 
our local network from our local network?

Re: mailx vs pine local mail scan times

Posted by Matt Kettler <mk...@comcast.net>.

At 12:36 PM 3/5/2005, Rob Fantini wrote:
>Hello,
>  - mailx [command line] mail is processed 5x faster than mail sent using 
> pine.

Your difference in time is 1.1 second vs 5.8 seconds.  Is that always 
consistent over a large set of emails?

I can see from your results you've got network checks enabled, including 
razor. That's going to introduce considerable variability between scan 
times for a single message. Sometimes razor takes a half second to respond, 
other times it takes several seconds.

With network checks enabled it's pretty difficult to compare scan times 
unless you have a huge sample set (>20,000 messages) to average together.

Also, this particular test is slightly questionable because the two emails 
are back-to-back, and the first one is slower. The second scan is going to 
benefit from the DNS resolver caching the DNS query results from the first 
message.

At the very least, try this test with 3 consecutive mails per client to get 
some feel of variance in network lookup times.
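
One way to run that comparison is to pull the scantime values out of the spamd log and summarize them per client. The following is a rough Python sketch under a few assumptions: the "result:" entries are on one line each in the real log (they are wrapped in the post above), and pine messages can be recognized by a Message-ID starting with "Pine.". The function names are made up for the example.

```python
import re
import statistics
from collections import defaultdict

# Picks the scantime and Message-ID out of a spamd "result:" line.
RESULT_RE = re.compile(
    r"scantime=(?P<secs>[0-9.]+),size=\d+,mid=<(?P<mid>[^>]*)>"
)

def client_of(mid):
    # Assumption: pine Message-IDs start with "Pine."; the rest is mailx.
    return "pine" if mid.startswith("Pine.") else "mailx"

def scan_time_stats(log_text):
    """Return {client: (count, mean, stdev)} from spamd result lines."""
    times = defaultdict(list)
    for m in RESULT_RE.finditer(log_text):
        times[client_of(m.group("mid"))].append(float(m.group("secs")))
    stats = {}
    for client, values in times.items():
        # stdev needs at least two samples; report 0.0 for a single one.
        spread = statistics.stdev(values) if len(values) > 1 else 0.0
        stats[client] = (len(values), statistics.mean(values), spread)
    return stats
```

Feeding it the log text after several alternating pine and mailx test messages gives a count, mean and standard deviation for each client. If the spread within one client is as large as the gap between the two means, the difference is probably down to network lookups rather than the mail client.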