Posted to ruleqa@spamassassin.apache.org by Henrik Krohns <he...@hege.li> on 2018/09/27 05:06:41 UTC
Please use --after in mass-checks
Hello mass checkers,
Please note the --after clauses that have been added to the sample automasscheck.cf:
# Use --after selector for corpus to prevent unnecessary processing.
# Current ruleqa settings: ham 6 years, spam 2 months
# Anything older than that will be ignored by ruleqa regardless.
run_all_masschecks() {
  ### sample: single corpus ###
  run_masscheck single-corpus \
    --after=-174182400 ham:dir:/path/to/Maildir/.Ham/ \
    --after=-4838400 spam:dir:/path/to/Maildir/.Spam/
Some of you are submitting spam older than 8 weeks. While it breaks
nothing, it's just wasting your own resources since ruleqa will filter it
anyway. :-)
$ find spam*log -mtime -30 | while read -r f; do echo "=== $f"; perl -ne 'next unless /\btime=(\d+)/; $age = (time-$1)/604800; print "$age\n"' < "$f" | histogram; done
(message ages below are in weeks)
=== spam-darxus.log
Count: 43283
Range: 0.203 - 299.126; Mean: 127.438; Median: 133.250; Stddev: 78.538
Percentiles: 90th: 234.266; 95th: 250.646; 99th: 289.647
0.203 - 1.090: 5 |
1.090 - 2.629: 9 |
2.629 - 5.301: 30 |
5.301 - 9.943: 1005 ####
9.943 - 18.003: 2509 #########
18.003 - 32.001: 3413 #############
32.001 - 56.309: 4700 ##################
56.309 - 98.521: 5097 ###################
98.521 - 171.826: 12462 ###############################################
171.826 - 299.126: 14053 #####################################################
=== spam-grenier.log
Count: 3345
Range: 0.292 - 306.112; Mean: 164.772; Median: 186.234; Stddev: 64.757
Percentiles: 90th: 236.786; 95th: 240.271; 99th: 245.981
0.292 - 1.233: 3 |
1.233 - 2.859: 8 |
2.859 - 5.669: 8 |
5.669 - 10.525: 18 #
10.525 - 18.919: 31 #
18.919 - 33.424: 85 ##
33.424 - 58.494: 119 ###
58.494 - 101.821: 404 ############
101.821 - 176.701: 839 ########################
176.701 - 306.112: 1830 #####################################################
=== spam-jarif.log
Count: 1556
Range: 0.252 - 189.368; Mean: 18.291; Median: 12.747; Stddev: 15.556
Percentiles: 90th: 35.101; 95th: 36.848; 99th: 38.096
0.252 - 1.069: 72 #####
1.069 - 2.420: 94 #######
2.420 - 4.652: 203 ###############
4.652 - 8.341: 344 #########################
8.341 - 14.438: 94 #######
14.438 - 24.515: 21 ##
24.515 - 41.169: 725 #####################################################
41.169 - 189.368: 3 |
=== spam-jbrooks.log
Count: 6039
Range: 0.457 - 59.422; Mean: 13.405; Median: 10.897; Stddev: 10.720
Percentiles: 90th: 34.105; 95th: 34.978; 99th: 36.631
0.457 - 1.115: 315 ###########
1.115 - 2.070: 371 #############
2.070 - 3.455: 188 #######
3.455 - 5.465: 932 ##################################
5.465 - 8.384: 852 ###############################
8.384 - 12.619: 613 ######################
12.619 - 18.765: 1472 #####################################################
18.765 - 27.686: 391 ##############
27.686 - 40.632: 900 ################################
40.632 - 59.422: 5 |
=== spam-llanga.log
Count: 10805
Range: 0.284 - 78.659; Mean: 45.645; Median: 50.956; Stddev: 18.487
Percentiles: 90th: 66.387; 95th: 69.045; 99th: 77.248
0.284 - 0.941: 38 |
0.941 - 1.932: 71 #
1.932 - 3.431: 108 #
3.431 - 5.694: 153 ##
5.694 - 9.115: 264 ###
9.115 - 14.284: 236 ##
14.284 - 22.093: 616 ######
22.093 - 33.892: 1119 ###########
33.892 - 51.721: 3001 ###############################
51.721 - 78.659: 5199 #####################################################
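If you want a quick number instead of a histogram, something like the following could count how many entries in one of your own logs fall outside a given window. It is only a sketch: count_too_old is a hypothetical helper name, it assumes each log line carries a time=<epoch seconds> field (the same field the Perl one-liner above matches), and the optional third argument exists only so that "now" can be pinned.

```shell
# Hypothetical helper: count log entries older than a cutoff given in weeks.
# Usage: count_too_old LOGFILE WEEKS [NOW_EPOCH]
count_too_old() {
    logfile=$1
    weeks=$2
    now=${3:-$(date +%s)}                      # default to the current time
    cutoff=$(( now - weeks * 7 * 86400 ))      # epoch seconds of the cutoff
    awk -v cutoff="$cutoff" '
        # Match "time=" only when it is not part of a longer word (e.g. mtime=).
        match($0, /(^|[^A-Za-z0-9_])time=[0-9]+/) {
            s = substr($0, RSTART, RLENGTH)
            sub(/^.*=/, "", s)                 # keep only the digits
            if (s + 0 < cutoff) old++
            total++
        }
        END { printf "%d of %d entries older than cutoff\n", old, total }
    ' "$logfile"
}
```

For example, `count_too_old spam-mylog.log 8` would report how many spam entries are older than the 8 week window (spam-mylog.log being a placeholder for your own log file).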
Cheers,
Henrik
Re: Please use --after in mass-checks
Posted by Henrik Krohns <he...@hege.li>.
On Thu, Sep 27, 2018 at 06:03:13PM +0300, Henrik Krohns wrote:
> On Thu, Sep 27, 2018 at 07:52:14AM -0700, John Hardin wrote:
> > On Thu, 27 Sep 2018, Henrik Krohns wrote:
> >
> > >
> > >Hello mass checkers,
> > >
> > >Please notice the --after clauses added to automasscheck.cf.
> > >
> > ># Use --after selector for corpus to prevent unnecessary processing.
> > ># Current ruleqa settings: ham 6 years, spam 2 months
> > ># Anything older than that will be ignored by ruleqa regardless.
> > >run_all_masschecks() {
> > > ### sample: single corpus ###
> > > run_masscheck single-corpus \
> > > --after=-174182400 ham:dir:/path/to/Maildir/.Ham/ \
> > > --after=-4838400 spam:dir:/path/to/Maildir/.Spam/
> >
> > What are those values in terms of? delta seconds from now?
>
> Yep. I figured people don't have parsedate. :-)
>
> $ ./mass-check --help
>
> --after=N only test mails received after time_t N (negative values
> are an offset from current time, e.g. -86400 = last day)
> or after date as parsed by Time::ParseDate (e.g. '-6 months')
FYI, the server-side values are defined in masses/rule-qa/reports-from-logs:
# what's the max age of mail we will accept data from? (in weeks)
# TODO: maybe this should be in ~/.corpus
my $OLDEST_HAM_WEEKS = 72 * 4; # 72 months = 6 years
my $OLDEST_SPAM_WEEKS = 2 * 4; # 2 months
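For reference, the --after offsets in the sample config are just these week counts converted to seconds. A minimal sketch of the arithmetic follows; weeks_to_after_offset is a made-up helper name for illustration, not part of mass-check or automasscheck.

```shell
# Hypothetical helper: turn a window in weeks into the negative
# seconds offset that --after accepts.
weeks_to_after_offset() {
    echo "-$(( $1 * 7 * 86400 ))"
}

# Same week counts as OLDEST_HAM_WEEKS and OLDEST_SPAM_WEEKS above.
ham_after=$(weeks_to_after_offset $(( 72 * 4 )))    # 288 weeks
spam_after=$(weeks_to_after_offset $(( 2 * 4 )))    # 8 weeks

echo "ham:  --after=$ham_after"     # ham:  --after=-174182400
echo "spam: --after=$spam_after"    # spam: --after=-4838400
```

If the server-side limits ever change, the same calculation gives the new values to put in automasscheck.cf.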
Re: Please use --after in mass-checks
Posted by Henrik Krohns <he...@hege.li>.
On Thu, Sep 27, 2018 at 07:52:14AM -0700, John Hardin wrote:
> On Thu, 27 Sep 2018, Henrik Krohns wrote:
>
> >
> >Hello mass checkers,
> >
> >Please notice the --after clauses added to automasscheck.cf.
> >
> ># Use --after selector for corpus to prevent unnecessary processing.
> ># Current ruleqa settings: ham 6 years, spam 2 months
> ># Anything older than that will be ignored by ruleqa regardless.
> >run_all_masschecks() {
> > ### sample: single corpus ###
> > run_masscheck single-corpus \
> > --after=-174182400 ham:dir:/path/to/Maildir/.Ham/ \
> > --after=-4838400 spam:dir:/path/to/Maildir/.Spam/
>
> What are those values in terms of? delta seconds from now?
Yep. I figured people don't have parsedate. :-)
$ ./mass-check --help
  --after=N     only test mails received after time_t N (negative values
                are an offset from current time, e.g. -86400 = last day)
                or after date as parsed by Time::ParseDate (e.g. '-6 months')
Re: Please use --after in mass-checks
Posted by John Hardin <jh...@impsec.org>.
On Thu, 27 Sep 2018, Henrik Krohns wrote:
>
> Hello mass checkers,
>
> Please notice the --after clauses added to automasscheck.cf.
>
> # Use --after selector for corpus to prevent unnecessary processing.
> # Current ruleqa settings: ham 6 years, spam 2 months
> # Anything older than that will be ignored by ruleqa regardless.
> run_all_masschecks() {
> ### sample: single corpus ###
> run_masscheck single-corpus \
> --after=-174182400 ham:dir:/path/to/Maildir/.Ham/ \
> --after=-4838400 spam:dir:/path/to/Maildir/.Spam/
What are those values in terms of? delta seconds from now?