Posted to users@spamassassin.apache.org by Felix Buenemann <Fe...@gmx.de> on 2008/09/13 10:10:02 UTC

Skip scanning for large mails

Hi,

Is it possible to skip scanning with spamc for large mails (e.g. > 1 MB)?

I receive lots of huge mails (15-30 MB) on my server, and scanning
takes very long for those mails, which will be ham anyway.

Best Regards,
    Felix Buenemann


Re: Skip scanning for large mails

Posted by mouss <mo...@netoyen.net>.
RobertH wrote:
>> From: mouss > 
>>
>> 1MB is probably too large. There is not much spam with such size
>> (although few ones were reported here).
>>
>>
> 
> What have studies of the average and realistic maximum spam email
> sizes concluded?
> 
> Was the conclusion the SA default size?
> 


I am not aware of any study.

But I just checked a junk folder of 5701 spams and found that:

- 4: have a size >= 1 MB (2 are about 1 MB and 2 about 1.7 MB)
- 7: between 256K and 500K
- 13: between 100K and 256K
(Incidentally, there is no spam in the 450K - 1M range. I'll have to
look at other spams.)

In short:
- 0.42% are >= 100K
- 0.19% are >= 256K
- 0.07% >= 1M

So there's not much benefit in spending SA processing on large messages.
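
The percentages above can be re-derived from the raw counts. The following is a minimal sketch in Python that only repeats the arithmetic from this message; the variable and function names are made up for illustration.

```python
# Counts taken from the message above: a junk folder of 5701 spams.
total = 5701
count_100k_to_256k = 13
count_256k_to_500k = 7
count_1m_or_more = 4


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Cumulative counts: everything at or above each size.
at_least_100k = count_100k_to_256k + count_256k_to_500k + count_1m_or_more
at_least_256k = count_256k_to_500k + count_1m_or_more
at_least_1m = count_1m_or_more

for label, count in [
    (">= 100K", at_least_100k),
    (">= 256K", at_least_256k),
    (">= 1M", at_least_1m),
]:
    print(f"{label}: {count} messages, {share(count, total):.2f}%")
```

This gives 24, 11 and 4 messages respectively, which matches the 0.42%, 0.19% and 0.07% quoted above.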

RE: Skip scanning for large mails

Posted by RobertH <ro...@abbacomm.net>.
> From: mouss > 
> 
> 1MB is probably too large. There is not much spam with such size
> (although few ones were reported here).
> 
> 

What have studies of the average and realistic maximum spam email
sizes concluded?

Was the conclusion the SA default size?

 - rh


Re: Skip scanning for large mails

Posted by mouss <mo...@netoyen.net>.
Felix Buenemann wrote:
> Hi,
> 
> is it possible to skip scanning with spamc for large mails? (eg. > 1MB)
> 
> I receive lots of huge mail (15-30MByte) on my server and the scanning 
> takes very long for those mails, that will be ham anyways.


1MB is probably too large a threshold. There is not much spam of that
size (although a few were reported here).


http://spamassassin.apache.org/full/3.2.x/doc/spamc.html

-s max_size, --max-size=max_size
     Set the maximum message size which will be sent to spamd -- any 
bigger than this threshold and the message will be returned unprocessed 
(default: 500 KB). If spamc gets handed a message bigger than this, it 
won't be passed to spamd. The maximum message size is 256 MB.

     The size is specified in bytes, as a positive integer greater than 
0. For example, -s 500000.


You can also skip spamc altogether if the tool you use to call it allows 
you to do so.
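
As a minimal sketch of both options (passing -s to spamc, or not calling spamc at all for large mail), a small Python wrapper could look like the one below. Only the -s option comes from the man page quoted above. The function names, the 256000-byte threshold and the replaceable runner parameter are assumptions made for illustration, not part of SpamAssassin.

```python
import subprocess

# Illustrative threshold in bytes; pick whatever fits your mail flow.
MAX_SIZE = 256000


def spamc_command(max_size=MAX_SIZE):
    """Build a spamc command line using the documented -s option."""
    return ["spamc", "-s", str(max_size)]


def filter_message(raw, max_size=MAX_SIZE, runner=subprocess.run):
    """Return the message, passed through spamc unless it is too large.

    raw is the complete message as bytes. A message larger than
    max_size is returned unchanged and spamc is never started.
    runner defaults to subprocess.run and is only a parameter so the
    function can be exercised without spamc installed.
    """
    if len(raw) > max_size:
        return raw
    result = runner(
        spamc_command(max_size),
        input=raw,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

A delivery script would read the message from standard input as bytes, call filter_message() on it, and write the result to standard output. The size check in Python avoids starting a spamc process for oversized mail; the -s value is kept as a second line of defence.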

Re: Skip scanning for large mails

Posted by Gene Heskett <ge...@verizon.net>.
On Saturday 13 September 2008, mouss wrote:
>Gene Heskett wrote:
>> There are rumors floating around that the python being shipped by
>> redhat/fedora is about 100x slower than python installed from the
>> tarballs.
>
>python? do you mean perl?
>
Possibly, at my age, CRS can be a problem. :)

I note that some perl was just recently replaced. Is this good?

>> Can this be confirmed?
>
>See the recent thread "using RHEL / CentOS / Fedora perl?"
>
>> I have reduced the size of what gets sent thru SA in my .procmailrc, first
>> to 50k a few months ago, and just now to 20k, as I am running Fedora 8
>> here and often have lags that can last 2-3 minutes.  Am I on the right
>> track to speed this up?
>>
>> The 419 and viagra spams are both out of control here, and there have been
>> no rules updates in months that I'm aware of.  Am I not on the right list
>> to be notified?  With the apparent demise of RDJ, updates have slowed to a
>> crawl, and finally stopped, or so it appears.
>
>don't use RDJ. use a recent version of SA and use sa-update. I use 3.2.5
>with JM Sought rules and few SARE rules. The latter haven't been updated
>since long, but this is normal (they are considered stable).

From an old root crontab where several older incantations have been commented 
out, it appears your gpgkey was changed, and when I did the new import, it 
was renamed to be a .2 key.  What do I do, remove the old one and rename this 
one without the .2?

Where might these JM Sought rules be obtained, and where are they placed for 
use?

Thanks.
 

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
... The prejudices people feel about each other disappear when they get
to know each other.
		-- Kirk, "Elaan of Troyius", stardate 4372.5

Re: Skip scanning for large mails

Posted by Martin Gregorie <ma...@gregorie.org>.
On Sat, 2008-09-13 at 07:57 -0400, Gene Heskett wrote:
> I have reduced the size of what gets sent thru SA in my .procmailrc, first to 
> 50k a few months ago, and just now to 20k, as I am running Fedora 8 here and 
> often have lags that can last 2-3 minutes.  Am I on the right track to speed 
> this up?
> 
You may want to look outside SA and its immediate environment for the
source of your slowdown. 

My spam collection is currently 53 messages totalling 342 KB (min 1.9
KB, average 6.5 KB, max 42KB). This entire collection runs through spamc
in 49 seconds. 

Spamc is running on a Thinkpad R61i (1.4 GHz Core Duo, Fedora 9)
using 100Mb/s ethernet to talk to spamd for this test, with a script
invoking it for each test message. 

Spamd is on an old (866 MHz, 512 MB) NetVista running Fedora 8 patched
up to date as of 3 hours ago, so it's using Perl 5.8.8. In addition it's
running named, so RBL lookups etc. are cached on the same box.
 
I hope these figures give you something to work on.
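
For scale, the figures above can be turned into per-message numbers. This is only arithmetic on the values quoted in this message (53 messages, 342 KB, 49 seconds); the variable names are made up for illustration, and the timing includes the script and network overhead of this particular setup.

```python
# Figures quoted in the message above.
messages = 53
total_kb = 342
elapsed_s = 49

avg_kb = total_kb / messages          # average message size in KB
s_per_message = elapsed_s / messages  # wall-clock seconds per message
kb_per_s = total_kb / elapsed_s       # overall throughput in KB/s

print(f"average size: {avg_kb:.1f} KB")
print(f"time per message: {s_per_message:.2f} s")
print(f"throughput: {kb_per_s:.1f} KB/s")
```

That is a little under one second per message on old hardware, so lags of 2-3 minutes on small mails point at something other than message size.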

Martin




Re: Skip scanning for large mails

Posted by mouss <mo...@netoyen.net>.
Gene Heskett wrote:
> 
> There are rumors floating around that the python being shipped by 
> redhat/fedora is about 100x slower than python installed from the tarballs.
> 

python? do you mean perl?

> Can this be confirmed?
> 

See the recent thread "using RHEL / CentOS / Fedora perl?"

> I have reduced the size of what gets sent thru SA in my .procmailrc, first to 
> 50k a few months ago, and just now to 20k, as I am running Fedora 8 here and 
> often have lags that can last 2-3 minutes.  Am I on the right track to speed 
> this up?
> 
> The 419 and viagra spams are both out of control here, and there have been no 
> rules updates in months that I'm aware of.  Am I not on the right list to be 
> notified?  With the apparent demise of RDJ, updates have slowed to a crawl, 
> and finally stopped, or so it appears.
> 

Don't use RDJ. Use a recent version of SA and use sa-update. I use 3.2.5 
with JM Sought rules and a few SARE rules. The latter haven't been updated 
in a long time, but this is normal (they are considered stable).

Re: Skip scanning for large mails

Posted by Gene Heskett <ge...@verizon.net>.
On Saturday 13 September 2008, Felix Buenemann wrote:
>Andrzej Adam Filip schrieb:
>> Felix Buenemann<Fe...@gmx.de>  wrote:
>>> is it possible to skip scanning with spamc for large mails? (eg. > 1MB)
>>>
>>> I receive lots of huge mail (15-30MByte) on my server and the scanning
>>> takes very long for those mails, that will be ham anyways.
>>>
>>> Best Regards,
>>>     Felix Buenemann
>>
>> <quote src="man spamc">
>>    -s max_size, --max-size=max_size
>>        Set the maximum message size which will be sent to spamd -- any
>>        bigger than this threshold and the message will be returned
>>        unprocessed (default: 500 KB).  If spamc gets handed a message
>>        bigger than this, it won’t be passed to spamd.  The maximum
>>        message size is 256 MB.  The size is specified in bytes, as a
>>        positive integer greater than 0.  For example, -s 500000.
>> </quote>
>
>OK, so I looked in the totally wrong place for it. I looked into the
>spamc wrapper and it actually uses -s 256000, so that can't explain long
>processing times. Seems it's back to checking logs for me, thx.
>
>-- Felix

There are rumors floating around that the python being shipped by 
redhat/fedora is about 100x slower than python installed from the tarballs.

Can this be confirmed?

I have reduced the size of what gets sent thru SA in my .procmailrc, first to 
50k a few months ago, and just now to 20k, as I am running Fedora 8 here and 
often have lags that can last 2-3 minutes.  Am I on the right track to speed 
this up?

The 419 and viagra spams are both out of control here, and there have been no 
rules updates in months that I'm aware of.  Am I not on the right list to be 
notified?  With the apparent demise of RDJ, updates have slowed to a crawl, 
and finally stopped, or so it appears.

Thanks.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Redundant ACLs. 

Re: Skip scanning for large mails

Posted by Felix Buenemann <Fe...@gmx.de>.
Andrzej Adam Filip schrieb:
> Felix Buenemann<Fe...@gmx.de>  wrote:
>> is it possible to skip scanning with spamc for large mails? (eg. > 1MB)
>>
>> I receive lots of huge mail (15-30MByte) on my server and the scanning
>> takes very long for those mails, that will be ham anyways.
>>
>> Best Regards,
>>     Felix Buenemann
>
> <quote src="man spamc">
>    -s max_size, --max-size=max_size
>        Set the maximum message size which will be sent to spamd -- any
>        bigger than this threshold and the message will be returned
>        unprocessed (default: 500 KB).  If spamc gets handed a message
>        bigger than this, it won’t be passed to spamd.  The maximum
>        message size is 256 MB.  The size is specified in bytes, as a
>        positive integer greater than 0.  For example, -s 500000.
> </quote>
>
OK, so I looked in the totally wrong place for it. I looked into the 
spamc wrapper and it actually uses -s 256000, so that can't explain long 
processing times. Seems it's back to checking logs for me, thx.

-- Felix


Re: Skip scanning for large mails

Posted by Andrzej Adam Filip <an...@onet.eu>.
Felix Buenemann <Fe...@gmx.de> wrote:
> is it possible to skip scanning with spamc for large mails? (eg. > 1MB)
>
> I receive lots of huge mail (15-30MByte) on my server and the scanning
> takes very long for those mails, that will be ham anyways.
>
> Best Regards,
>    Felix Buenemann

<quote src="man spamc">
  -s max_size, --max-size=max_size
      Set the maximum message size which will be sent to spamd -- any
      bigger than this threshold and the message will be returned
      unprocessed (default: 500 KB).  If spamc gets handed a message
      bigger than this, it won’t be passed to spamd.  The maximum
      message size is 256 MB.  The size is specified in bytes, as a
      positive integer greater than 0.  For example, -s 500000.
</quote>

-- 
[pl>en: Andrew] Andrzej Adam Filip : anfi@onet.eu : anfi@xl.wp.pl
:-) your own self.
  -- Larry Wall in <19...@wall.org>