Posted to users@spamassassin.apache.org by Sebastian Arcus <s....@open-t.co.uk> on 2017/09/15 10:11:49 UTC

SA not receiving fixed FORGED_MUA_MOZILLA update?

I am having problems with false positives for FORGED_MUA_MOZILLA for 
Yahoo emails. I see this has been already dealt with here and pushed to 
the 3.4 and trunk branches:

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7411

However, even after running sa-update, the file 20_meta_tests.cf still 
doesn't have the changes on my servers. Has this bugfix not been applied 
for some reason?

20_meta_tests.cf on my machines is dated May 17 and June 24 for the 
3.00xx and 4.00xx dirs under /var/lib/spamassassin - which is after the 
date the bugfix was pushed (2017-04-22).

I run SA 3.4.1 and 4.0.0 on different machines

Re: SA not receiving fixed FORGED_MUA_MOZILLA update?

Posted by Merijn van den Kroonenberg <me...@web2all.nl>.
> On 15/09/17 11:41, Kevin A. McGrail wrote:
>> On 9/15/2017 6:11 AM, Sebastian Arcus wrote:
>>> I am having problems with false positives for FORGED_MUA_MOZILLA for
>>> Yahoo emails. I see this has been already dealt with here and pushed
>>> to the 3.4 and trunk branches:
>>>
>>> https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7411
>>>
>>> However, even after running sa-update, the file 20_meta_tests.cf still
>>> doesn't have the changes on my servers. Has this bugfix not been
>>> applied for some reason?
>>>
>>> 20_meta_tests.cf on my machines is dated May 17 and June 24 for the
>>> 3.00xx and 4.00xx dirs under /var/lib/spamassassin - which is after
>>> the date the bugfix was pushed (2017-04-22).
>>>
>>> I run SA 3.4.1 and 4.0.0 on different machines
>>
>> Hi Sebastian,
>>
>> The Rule Promotion and Generation of tarballs with correct scores has
>> been an ongoing issue for the SpamAssassin team.  I would suggest you
>> simply copy the rule to your local.cf at this time.
>
> Hi Kevin,
>
> Thank you for the reply. Does that mean that no new rules have been
> pushed to SA installations in the past 5 months - or only some rules get
> pushed through?
>

Basically there have been no rule changes pushed since March.

That is, at some point new rules were pushed, but as they were incorrect,
those changes were reverted and the old set (March) was pushed again.



Re: getting help with SA sysadmin

Posted by Merijn van den Kroonenberg <me...@web2all.nl>.
> On Sep 15, 2017, at 12:24 PM, David Jones <dj...@ena.com> wrote:
>> You kinda have to work backwards through the scripts to find what is
>> generating the scores-set0 file and turning it into 72_scores.cf.  I am
>> grep'ing through the work dir on the SA server now but it contains a lot
>> of files.  I need to find the large dirs and exclude them.
>
> you may have already done this, but you could modify the scripts to not
> overwrite (or to save a copy of) the intermediate files, which may give a
> clue as to exactly where the problem is being introduced, i.e. runGA lines
> 57-59 and 124-132 (for 50_scores.cf)

Yes it would be very interesting to look at intermediate / temp files.
Especially of everything generated by
/masses/rule-update-score-gen/generate-new-scores.sh and the 'make' of the
freq files.

It seems the scoreset files are truncated: rule names starting with 'S' or
later never appear in the scoresets. I think we need to trace back through
the intermediate files to see at which stage the truncation happens. Then
we would at least have an idea which part of the code the problem is in,
because the whole 'make freqs' is pretty dense.
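
A rough way to spot the truncation stage, as a sketch only: the helper
below reports the highest first letter among the rule names in a file. It
assumes the file uses .cf-style "score NAME v v v v" lines, as
72_scores.cf does; the intermediate files may use a different layout, in
which case the pattern would need adjusting. The function names are made
up for illustration and are not part of the SA scripts.

```python
import re

# Assumption: lines look like "score RULE_NAME 1.000 1.000 1.000 1.000".
SCORE_LINE = re.compile(r"^\s*score\s+(\S+)")


def rule_names(text):
    """Return the rule names found on 'score' lines, in file order."""
    names = []
    for line in text.splitlines():
        match = SCORE_LINE.match(line)
        if match:
            names.append(match.group(1))
    return names


def last_initial(text):
    """Return the highest first letter among the rule names, or None."""
    initials = [name.lstrip("_")[:1].upper() for name in rule_names(text)]
    initials = [c for c in initials if c]
    return max(initials) if initials else None
```

Running last_initial over each saved intermediate file, in pipeline
order, might show the first stage where the result drops below 'S'.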


>
> another 'easy' test I would try would be to set numcpus in runGA to 1 just
> in case the problem is that somewhere there are multiple writers
> overwriting parts of the same file
>
> --
> Daniel J. Luke
>
>
>
>



Re: getting help with SA sysadmin

Posted by David Jones <dj...@ena.com>.
On 09/15/2017 03:33 PM, Daniel J. Luke wrote:
> On Sep 15, 2017, at 12:24 PM, David Jones <dj...@ena.com> wrote:
>> You kinda have to work backwards through the scripts to find what is generating the scores-set0 file and turning it into 72_scores.cf.  I am grep'ing through the work dir on the SA server now but it contains a lot of files.  I need to find the large dirs and exclude them.
> 
> you may have already done this, but you could modify the scripts to not overwrite (or to save a copy of) the intermediate files, which may give a clue as to exactly where the problem is being introduced, i.e. runGA lines 57-59 and 124-132 (for 50_scores.cf)
> 

I already restructured the script to use a common tmp dir on the 
server a few months ago.  Before that the scripts were writing all over 
the place in various home dirs on the old VM that crashed/went down back 
in mid March.

The issue is there is so much junk getting checked out of SVN in various 
subdirs that it's hard to follow the logic through all of the 
scripts.  You can look at generate-new-scores.sh and quickly find 
your eyes crossing and get lost.  I made some local notes in a text file 
trying to map out the scripts and the files they create but it's kind of 
a mess.  I guess I need to dive into this again and go deeper.  I spent 
many hours/days a couple of months ago trying to figure this out and it 
wore me out.


> another 'easy' test I would try would be to set numcpus in runGA to 1 just in case the problem is that somewhere there are multiple writers overwriting parts of the same file
> 

Good idea.  I tried manually overriding the script to force numcpus=1 
but it made no difference over the weekend.

-- 
David Jones

Re: getting help with SA sysadmin

Posted by "Daniel J. Luke" <dl...@geeklair.net>.
On Sep 15, 2017, at 12:24 PM, David Jones <dj...@ena.com> wrote:
> You kinda have to work backwards through the scripts to find what is generating the scores-set0 file and turning it into 72_scores.cf.  I am grep'ing through the work dir on the SA server now but it contains a lot of files.  I need to find the large dirs and exclude them.

you may have already done this, but you could modify the scripts to not overwrite (or to save a copy of) the intermediate files, which may give a clue as to exactly where the problem is being introduced, i.e. runGA lines 57-59 and 124-132 (for 50_scores.cf)

another 'easy' test I would try would be to set numcpus in runGA to 1 just in case the problem is that somewhere there are multiple writers overwriting parts of the same file

-- 
Daniel J. Luke




Re: getting help with SA sysadmin

Posted by "Daniel J. Luke" <dl...@geeklair.net>.
On Sep 15, 2017, at 12:24 PM, David Jones <dj...@ena.com> wrote:
> 1. Actually start here with the runGA call:
> 
> http://svn.apache.org/viewvc/spamassassin/trunk/masses/rule-update-score-gen/generate-new-scores.sh?revision=1798589&view=markup#l271
> 
> 2. Here is the runGA script (not changed in almost 8 years):
> 
> http://svn.apache.org/viewvc/spamassassin/trunk/masses/runGA?view=log
> 
> 3. Somewhere in the bottom of the runGA script there is a problem generating a complete scores-set0 which becomes the 72_scores.cf:
> 
> http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/scores/
> 
> 4. This shows the resulting difference between the last good 72_scores.cf and the latest version:
> 
> http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/scores/72_scores.cf?r1=1786976&r2=1808406

I don't know if it helps, but I (manually) removed the lines that looked just like score changes from that diff and I get the following (on the assumption we want to have a good example of what to look for after making changes to know that this is 'fixed'):

-score AC_SPAMMY_URI_PATTERNS1               1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS10              1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS11              1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS12              1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS2               1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS3               1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS4               1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS8               1.000 1.000 1.000 1.000
-score AC_SPAMMY_URI_PATTERNS9               1.000 1.000 1.000 1.000
-score ADVANCE_FEE_2_NEW_FORM                1.000 1.000 1.000 1.000
-score ADVANCE_FEE_2_NEW_MONEY               1.997 0.001 1.997 0.001
-score AXB_XM_FORGED_OL2600                  1.190 2.699 1.190 2.699
-score BODY_EMPTY                            1.997 1.999 1.997 1.999
-score CANT_SEE_AD                           2.996 0.500 2.996 0.500
-score CN_B2B_SPAMMER                        0.001 0.001 0.001 0.001
-score COMMENT_GIBBERISH                     1.498 1.499 1.498 1.499
-score ENCRYPTED_MESSAGE                     -1.000 -1.000 -1.000 -1.000
-score FORM_LOW_CONTRAST                     1.000 1.000 1.000 1.000
-score FREEMAIL_DOC_PDF_BCC                  2.596 2.599 2.596 2.599
-score FREEMAIL_FORGED_FROMDOMAIN            0.001 0.199 0.001 0.199
-score FROM_MISSPACED                        0.001 0.001 0.001 0.001
-score FROM_MISSP_SPF_FAIL                   0.001 1.000 0.001 1.000
-score FROM_MISSP_XPRIO                      0.001 0.001 0.001 0.001
-score FROM_WORDY_SHORT                      1.000 1.000 1.000 1.000
-score GOOGLE_DOCS_PHISH                     1.000 1.000 1.000 1.000
-score GOOGLE_DOCS_PHISH_MANY                1.000 1.000 1.000 1.000
-score GOOG_MALWARE_DNLD                     1.000 1.000 1.000 1.000
-score HEADER_FROM_DIFFERENT_DOMAINS         0.001 0.001 0.001 0.001
-score HEXHASH_WORD                          1.000 1.000 1.000 1.000
-score HK_RANDOM_FROM                        0.998 0.001 0.998 0.001
-score HK_SCAM_N15                           1.935 2.499 1.935 2.499
-score HTML_OFF_PAGE                         1.000 1.000 1.000 1.000
-score LIST_PRTL_PUMPDUMP                    1.000 1.000 1.000 1.000
-score LONG_HEX_URI                          2.194 2.290 2.194 2.290
-score LOTS_OF_MONEY                         0.001 0.001 0.001 0.001
-score LOTTO_DEPT                            0.001 0.001 0.001 0.001
-score LUCRATIVE                             1.000 1.000 1.000 1.000
-score MIMEOLE_DIRECT_TO_MX                  1.445 0.381 1.445 0.381
-score MIME_NO_TEXT                          1.000 1.000 1.000 1.000
-score MONEY_FRAUD_3                         2.896 0.001 2.896 0.001
-score MONEY_FRAUD_5                         3.096 0.001 3.096 0.001
-score MONEY_FRAUD_8                         2.548 0.001 2.548 0.001
-score MONEY_LOTTERY                         2.498 1.611 2.498 1.611
-score MSGID_NOFQDN1                         2.395 3.299 2.395 3.299
-score MSM_PRIO_REPTO                        2.497 0.180 2.497 0.180
-score NSL_RCVD_FROM_USER                    0.548 0.001 0.548 0.001
-score NSL_RCVD_HELO_USER                    1.273 0.001 1.273 0.001
-score PHP_NOVER_MUA                         1.000 1.000 1.000 1.000
-score PHP_ORIG_SCRIPT                       0.502 2.499 0.502 2.499
-score PHP_SCRIPT_MUA                        1.000 1.000 1.000 1.000
-score PP_MIME_FAKE_ASCII_TEXT               0.429 0.001 0.429 0.001
-score PP_TOO_MUCH_UNICODE02                 0.500 0.500 0.500 0.500
-score PP_TOO_MUCH_UNICODE05                 1.000 1.000 1.000 1.000
-score PUMPDUMP                              1.000 1.000 1.000 1.000
-score PUMPDUMP_MULTI                        1.000 1.000 1.000 1.000
-score RAND_HEADER_MANY                      1.000 1.000 1.000 1.000
-score RCVD_IN_MSPIKE_BL                     0.001 0.010 0.001 0.010
-score RCVD_IN_MSPIKE_H2                     0.001 -2.800 0.001 -2.800
-score RCVD_IN_MSPIKE_H3                     0.001 -0.010 0.001 -0.010
-score RCVD_IN_MSPIKE_H4                     0.001 -0.010 0.001 -0.010
-score RCVD_IN_MSPIKE_H5                     0.001 -1.000 0.001 -1.000
-score RCVD_IN_MSPIKE_L2                     0.001 0.001 0.001 0.001
-score RCVD_IN_MSPIKE_L3                     0.001 0.001 0.001 0.001
-score RCVD_IN_MSPIKE_L4                     0.001 0.001 0.001 0.001
-score RCVD_IN_MSPIKE_L5                     0.001 0.001 0.001 0.001
-score RCVD_IN_MSPIKE_WL                     0.001 -0.010 0.001 -0.010
-score RCVD_IN_MSPIKE_ZBI                    0.001 0.001 0.001 0.001
-score RP_MATCHES_RCVD                       -1.050 -0.001 -1.050 -0.001
-score SHARE_50_50                           2.121 1.818 2.121 1.818
-score SPOOFED_FREEM_REPTO                   2.498 1.368 2.498 1.368
-score SPOOFED_FREEM_REPTO_CHN               1.000 1.000 1.000 1.000
-score STATIC_XPRIO_OLE                      1.997 0.001 1.997 0.001
-score STOCK_LOW_CONTRAST                    2.030 2.347 2.030 2.347
-score STOCK_TIP                             1.000 1.000 1.000 1.000
-score STYLE_GIBBERISH                       2.800 3.093 2.800 3.093
-score SURBL_BLOCKED                         0.001 0.001 0.001 0.001
-score SYSADMIN                              1.000 1.000 1.000 1.000
-score THIS_AD                               0.596 2.200 0.596 2.200
-score TO_EQ_FM_DIRECT_MX                    2.497 0.622 2.497 0.622
-score TO_EQ_FM_DOM_SPF_FAIL                 0.001 0.001 0.001 0.001
-score TO_EQ_FM_SPF_FAIL                     0.001 0.001 0.001 0.001
-score TO_IN_SUBJ                            0.099 0.099 0.099 0.099
-score TO_NO_BRKTS_FROM_MSSP                 0.001 0.001 0.001 0.001
-score TO_NO_BRKTS_HTML_IMG                  0.001 2.000 0.001 2.000
-score TO_NO_BRKTS_HTML_ONLY                 1.997 0.001 1.997 0.001
-score TO_NO_BRKTS_MSFT                      2.497 0.001 2.497 0.001
-score TO_NO_BRKTS_NORDNS_HTML               0.398 0.001 0.398 0.001
-score TO_NO_BRKTS_PCNT                      2.497 0.001 2.497 0.001
-score TVD_SPACE_ENCODED                     2.497 0.001 2.497 0.001
-score TVD_SPACE_ENC_FM_MIME                 1.997 0.001 1.997 0.001
-score TVD_SPACE_RATIO_MINFP                 2.497 0.001 2.497 0.001
-score TW_GIBBERISH_MANY                     1.000 1.000 1.000 1.000
-score UC_GIBBERISH_OBFU                     1.000 1.000 1.000 1.000
-score URI_DATA                              1.000 1.000 1.000 1.000
-score URI_GOOGLE_PROXY                      0.710 1.378 0.710 1.378
-score URI_ONLY_MSGID_MALF                   0.001 1.191 0.001 1.191
-score URI_OPTOUT_3LD                        1.000 1.000 1.000 1.000
-score URI_PHISH                             3.995 3.999 3.995 3.999
-score URI_TRY_3LD                           0.195 0.001 0.195 0.001
-score URI_TRY_USME                          0.001 0.001 0.001 0.001
-score URI_WPADMIN                           3.396 3.014 3.396 3.014
-score URI_WP_DIRINDEX                       1.000 1.000 1.000 1.000
-score URI_WP_HACKED                         2.996 3.000 2.996 3.000
-score URI_WP_HACKED_2                       1.187 1.764 1.187 1.764
-score XPRIO                                 2.248 2.249 2.248 2.249
-score XPRIO_SHORT_SUBJ                      1.000 1.000 1.000 1.000
+score ADVANCE_FEE_3_NEW_FRM_MNY      0.001 2.296 0.001 2.296
+score ADVANCE_FEE_4_NEW_FRM_MNY      2.799 2.141 2.799 2.141
+score ADVANCE_FEE_4_NEW_MONEY        3.200 2.508 3.200 2.508
+score ADVANCE_FEE_5_NEW_FRM_MNY      3.199 3.099 3.199 3.099
+score ADVANCE_FEE_5_NEW_MONEY        2.976 0.558 2.976 0.558
+score AXB_X_FF_SEZ_S                 3.600 3.399 3.600 3.399
+score BODY_SINGLE_URI                0.001 1.607 0.001 1.607
+score BODY_SINGLE_WORD               2.602 0.001 2.602 0.001
+score COMPENSATION                   0.001 0.000 0.001 0.000
+score DEAR_BENEFICIARY               0.483 1.470 0.483 1.470
+score DEAR_EMAIL                     3.499 2.715 3.499 2.715
+score DX_TEXT_01                     2.699 2.599 2.699 2.599
+score FROM_MISSP_DYNIP               1.536 2.399 1.536 2.399
+score FROM_MISSP_EH_MATCH            1.685 1.263 1.685 1.263
+score HK_NAME_MR_MRS                 4.085 2.994 4.085 2.994
+score HK_SCAM_N3                     2.799 2.699 2.799 2.699
+score HTML_FONT_TINY                 2.194 2.648 2.194 2.648
+score KHOP_DYNAMIC                   3.030 1.997 3.030 1.997
+score LIST_PARTIAL_SHORT_MSG         2.499 2.276 2.499 2.276
+score MILLION_USD                    3.157 2.189 3.157 2.189
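
For anyone wanting to repeat that filtering without doing it by hand, here
is a minimal sketch. It assumes the input is diff text whose changed lines
look like "-score NAME ..." and "+score NAME ...", as in the list above;
split_diff is a made-up name, not an existing SA tool. A rule that shows
up on both a "-" and a "+" line is treated as a plain score change, and
the rest are reported as removed-only or added-only.

```python
def split_diff(diff_text):
    """Split 'score' lines of a diff into removed-only, added-only, changed."""
    removed, added = {}, {}
    for line in diff_text.splitlines():
        sign, fields = line[:1], line[1:].split()
        # Only "+score NAME ..." / "-score NAME ..." lines are of interest.
        if sign in ("+", "-") and len(fields) >= 2 and fields[0] == "score":
            target = removed if sign == "-" else added
            target[fields[1]] = fields[2:]
    changed = sorted(set(removed) & set(added))
    only_removed = sorted(set(removed) - set(added))
    only_added = sorted(set(added) - set(removed))
    return only_removed, only_added, changed
```

After a fix, an empty or short only_removed list for the same pair of
revisions would suggest the generated file is complete again.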

> You kinda have to work backwards through the scripts to find what is generating the scores-set0 file and turning it into 72_scores.cf.  I am grep'ing through the work dir on the SA server now but it contains a lot of files.  I need to find the large dirs and exclude them.

-- 
Daniel J. Luke




Re: getting help with SA sysadmin

Posted by David Jones <dj...@ena.com>.
On 09/15/2017 09:52 AM, Merijn van den Kroonenberg wrote:
>> On Sep 15, 2017, at 9:46 AM, David Jones <dj...@ena.com> wrote:
>>> 3. I have narrowed down the problem to the general area of a perl
>>> Makefile which builds a custom garescorer.c file which does some
>>> statistical analysis to determine the best score for rules in the
>>> 72_scores.cf.  These 72_scores.cf are excluded from 50_scores.cf (static
>>> scores) and are currently incomplete making these rules default to 1.0.
>>> Most of these missing rules should be much higher than 1.0 causing SA
>>> to allow spam through on most installations that don't have an optimized
>>> MTA in front of SA.
>>>
>>> https://wiki.apache.org/spamassassin/InfraNotes2017#mkupdates
>>>
>>> ~/svn/trunk/build/mkupdates/mkupdate-with-scores
>>>
>>>      masses -> perl Makefile.PL && make (complete build of SA and test)
>>>          - perl hit-frequencies
>>>          - garescorer - compiles and runs it, requires build/pga
>>>                  <--- THE PROBLEM IS NEAR HERE
>>
>> where are hit-frequencies and garescorer?
> 
> Seems it's here:
> http://svn.apache.org/repos/asf/spamassassin/trunk/masses/
> 
> Not sure where it's tied in, though.
> 
>>
>>> I think the problem is somewhere behind line 127 and 128 which does many
>>> things/steps:
>>
>> lines 127 and 128 of mkupdate-with-scores? (that just looks like it's
>> building a version of SpamAssassin from trunk svn? maybe you are referring
>> to something else?)
>>


1. Actually start here with the runGA call:

http://svn.apache.org/viewvc/spamassassin/trunk/masses/rule-update-score-gen/generate-new-scores.sh?revision=1798589&view=markup#l271


2. Here is the runGA script (not changed in almost 8 years):

http://svn.apache.org/viewvc/spamassassin/trunk/masses/runGA?view=log


3. Somewhere in the bottom of the runGA script there is a problem 
generating a complete scores-set0 which becomes the 72_scores.cf:

http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/scores/


4. This shows the resulting difference between the last good 72_scores.cf 
and the latest version:

http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/scores/72_scores.cf?r1=1786976&r2=1808406


You kinda have to work backwards through the scripts to find what is 
generating the scores-set0 file and turning it into 72_scores.cf.  I am 
grep'ing through the work dir on the SA server now but it contains a lot 
of files.  I need to find the large dirs and exclude them.

>> --
>> Daniel J. Luke
>>
>>
>>
>>
> 
> 


-- 
David Jones

Re: getting help with SA sysadmin

Posted by Merijn van den Kroonenberg <me...@web2all.nl>.
> On Sep 15, 2017, at 9:46 AM, David Jones <dj...@ena.com> wrote:
>> 3. I have narrowed down the problem to the general area of a perl
>> Makefile which builds a custom garescorer.c file which does some
>> statistical analysis to determine the best score for rules in the
>> 72_scores.cf.  These 72_scores.cf are excluded from 50_scores.cf (static
>> scores) and are currently incomplete making these rules default to 1.0.
>> Most of these missing rules should be much higher than 1.0 causing SA
>> to allow spam through on most installations that don't have an optimized
>> MTA in front of SA.
>>
>> https://wiki.apache.org/spamassassin/InfraNotes2017#mkupdates
>>
>> ~/svn/trunk/build/mkupdates/mkupdate-with-scores
>>
>>     masses -> perl Makefile.PL && make (complete build of SA and test)
>>         - perl hit-frequencies
>>         - garescorer - compiles and runs it, requires build/pga
>>                 <--- THE PROBLEM IS NEAR HERE
>
> where are hit-frequencies and garescorer?

Seems it's here:
http://svn.apache.org/repos/asf/spamassassin/trunk/masses/

Not sure where it's tied in, though.

>
>> I think the problem is somewhere behind line 127 and 128 which does many
>> things/steps:
>
> lines 127 and 128 of mkupdate-with-scores? (that just looks like it's
> building a version of SpamAssassin from trunk svn? maybe you are referring
> to something else?)
>
> --
> Daniel J. Luke
>
>
>
>



Re: getting help with SA sysadmin

Posted by "Daniel J. Luke" <dl...@geeklair.net>.
On Sep 15, 2017, at 9:46 AM, David Jones <dj...@ena.com> wrote:
> 3. I have narrowed down the problem to the general area of a perl Makefile which builds a custom garescorer.c file which does some statistical analysis to determine the best score for rules in the 72_scores.cf.  These 72_scores.cf are excluded from 50_scores.cf (static scores) and are currently incomplete making these rules default to 1.0.  Most of these missing rules should be much higher than 1.0 causing SA to allow spam through on most installations that don't have an optimized MTA in front of SA.
> 
> https://wiki.apache.org/spamassassin/InfraNotes2017#mkupdates
> 
> ~/svn/trunk/build/mkupdates/mkupdate-with-scores
> 
>     masses -> perl Makefile.PL && make (complete build of SA and test)
>         - perl hit-frequencies
>         - garescorer - compiles and runs it, requires build/pga                          <--- THE PROBLEM IS NEAR HERE

where are hit-frequencies and garescorer?

> I think the problem is somewhere behind line 127 and 128 which does many things/steps:

lines 127 and 128 of mkupdate-with-scores? (that just looks like it's building a version of SpamAssassin from trunk svn? maybe you are referring to something else?)

-- 
Daniel J. Luke




Re: getting help with SA sysadmin

Posted by David Jones <dj...@ena.com>.
On 09/15/2017 07:26 AM, Kevin A. McGrail wrote:
> On 9/15/2017 8:22 AM, Merijn van den Kroonenberg wrote:
>> So one of the problems with solving this is that getting help requires
>> a complicated enrollment.
>>
>> Maybe there are chunks of work which can be offloaded to the community
>> without requiring full sysadmin enrollment?
>>
>> I can imagine lots of scripts are involved in running the rule
>> generation. Maybe some need reviewing or something? (just throwing
>> things in the air here)
>
> I could be forest for the trees but I haven't seen the ability for 
> anyone to make progress with minimal access.  Dave, tell me if you 
> agree, please?
>
>
> Regards,
>
> KAM
>
I hit a wall with my troubleshooting a couple of months ago. Normally I 
am able to solve very detailed issues like this without much problem.  
Here are the issues:

1. These are very old scripts written in disjointed pieces over a long 
period of time by different people with different styles. There is very 
little consistency and no proper modularity by function.

2. A single run takes a long time (40+ minutes), which makes 
troubleshooting very slow and difficult.  The scripts mostly rely on 
very verbose bash "set -x" output for logging, and it is hard to dig 
through that massive output and correlate it with problems. They need to 
be rewritten in modular parts with proper debug logging.

3. I have narrowed down the problem to the general area of a perl 
Makefile which builds a custom garescorer.c file which does some 
statistical analysis to determine the best score for rules in 
72_scores.cf.  The rules in 72_scores.cf are excluded from 50_scores.cf 
(static scores), and 72_scores.cf is currently incomplete, making the 
missing rules default to 1.0.  Most of these missing rules should score 
much higher than 1.0, so SA lets spam through on most installations that 
don't have an optimized MTA in front of SA.

https://wiki.apache.org/spamassassin/InfraNotes2017#mkupdates

~/svn/trunk/build/mkupdates/mkupdate-with-scores

     masses -> perl Makefile.PL && make (complete build of SA and test)
         - perl hit-frequencies
         - garescorer - compiles and runs it, requires build/pga
                 <--- THE PROBLEM IS NEAR HERE

I think the problem is somewhere behind lines 127 and 128, which do many 
things/steps:

http://svn.apache.org/viewvc/spamassassin/trunk/build/mkupdates/

4. Most of the ruleqa/masscheck scripts are sh or bash.  I am not a perl 
person.  I am good at bash, Python, and a few others but not perl.  I 
can hack my way through reading and updating existing perl but that's 
about it.


I have been thinking about this the past few weeks and maybe there is a 
"band-aid" option where we could merge the current 72_scores.cf rules 
that are missing with the shorter version that is being generated today 
to keep the 72_scores.cf complete and get the rule updates happening again.

Incomplete version from last night:

http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/scores/72_scores.cf?revision=1808406&view=markup&sortby=date

Last good version:

http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/scores/72_scores.cf?revision=1786976&view=markup&sortby=date
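
To make the band-aid idea concrete, a minimal sketch under some
assumptions: both files contain plain "score NAME v v v v" lines, the
freshly generated scores should win where a rule appears in both, and any
rule present only in the last good file is carried over unchanged. The
function names are hypothetical. Comments and any other non-score lines
in the real files are not preserved here, and a rule that was dropped on
purpose would be brought back, so the output would need a manual review.

```python
import re

# Assumption: "score RULE_NAME v v v v"; group 2 keeps the values as text.
SCORE_LINE = re.compile(r"^\s*score\s+(\S+)\s+(.*\S)\s*$")


def parse_scores(text):
    """Map rule name -> score values (as a string) for every 'score' line."""
    scores = {}
    for line in text.splitlines():
        match = SCORE_LINE.match(line)
        if match:
            scores[match.group(1)] = match.group(2)
    return scores


def merge_scores(last_good_text, new_text):
    """Start from the last good scores, then overlay the new ones."""
    merged = parse_scores(last_good_text)
    merged.update(parse_scores(new_text))  # freshly generated scores win
    lines = [f"score {name:<37} {vals}" for name, vals in sorted(merged.items())]
    return "\n".join(lines) + "\n"
```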

Dave

-- 
David Jones


Re: getting help with SA sysadmin

Posted by "Kevin A. McGrail" <ke...@mcgrail.com>.
On 9/15/2017 8:22 AM, Merijn van den Kroonenberg wrote:
> So one of the problems with solving this is that getting help requires
> a complicated enrollment.
>
> Maybe there are chunks of work which can be offloaded to the community
> without requiring full sysadmin enrollment?
>
> I can imagine lots of scripts are involved in running the rule
> generation. Maybe some need reviewing or something? (just throwing things
> in the air here)

I could be forest for the trees but I haven't seen the ability for 
anyone to make progress with minimal access.  Dave, tell me if you 
agree, please?


Regards,

KAM


Re: getting help with SA sysadmin (was: SA not receiving fixed FORGED_MUA_MOZILLA update?)

Posted by Merijn van den Kroonenberg <me...@web2all.nl>.
> On 9/15/2017 7:43 AM, Merijn van den Kroonenberg wrote:
>> It sounds a bit like you guys are hitting a wall?
>>
>> Could any help from the community get things going again? If so, what
>> kind of skillset would be useful to tackle this thing?
>
> Yes, help is always welcome.  More hands make light work.  We set up a
> SysAdmins group at the SA project to help manage the SA Infrastructure
> just because of this issue and because I was the last holder of too much
> arcana.
>
> Off the cuff, the skills needed are:
>
> - Ethical SysAdmin agreement since you have access to private emails,
> etc. donated to the project.
>
> - Ubuntu SysAdmin Skills since that's the OS in question.
>
> - Redhat/Solaris/etc. skills helpful since old platform was on those
> platforms.
>
> And the applicant has to then be approved by the SA PMC as they need
> committer access.  Right now, we have 2 that passed with a 3rd whose
> vote is still pending but failed.  Out of the 2, the system complexity
> required more attention than he can give at this time.  So it's been me
> and Dave.

So one of the problems with solving this is that getting help requires
a complicated enrollment.

Maybe there are chunks of work which can be offloaded to the community
without requiring full sysadmin enrollment?

I can imagine lots of scripts are involved in running the rule
generation. Maybe some need reviewing or something? (just throwing things
in the air here)

>
> Some others have offered to help like Eric Wolleson.  I'll see if I can
> unlogjam that PMC process.
>
> Regards,
>
> KAM
>
>



Re: SA not receiving fixed FORGED_MUA_MOZILLA update?

Posted by "Kevin A. McGrail" <ke...@mcgrail.com>.
On 9/15/2017 7:43 AM, Merijn van den Kroonenberg wrote:
> It sounds a bit like you guys are hitting a wall?
>
> Could any help from the community get things going again? If so, what kind
> of skillset would be useful to tackle this thing?

Yes, help is always welcome.  More hands make light work.  We set up a 
SysAdmins group at the SA project to help manage the SA Infrastructure 
just because of this issue and because I was the last holder of too much 
arcana.

Off the cuff, the skills needed are:

- Ethical SysAdmin agreement since you have access to private emails, 
etc. donated to the project.

- Ubuntu SysAdmin Skills since that's the OS in question.

- Redhat/Solaris/etc. skills helpful since old platform was on those 
platforms.

And the applicant has to then be approved by the SA PMC as they need 
committer access.  Right now, we have 2 that passed with a 3rd whose 
vote is still pending but failed.  Out of the 2, the system complexity 
required more attention than he can give at this time.  So it's been me 
and Dave.

Some others have offered to help like Eric Wolleson.  I'll see if I can 
unlogjam that PMC process.

Regards,

KAM


Re: SA not receiving fixed FORGED_MUA_MOZILLA update?

Posted by Merijn van den Kroonenberg <me...@web2all.nl>.
> On 9/15/2017 6:54 AM, Sebastian Arcus wrote:
>> Thank you for the reply. Does that mean that no new rules have been
>> pushed to SA installations in the past 5 months - or only some rules
>> get pushed through?
>
> The system has been "down" since March 15 in that everything is working
> but we are purposefully not changing the DNS entries.
>
> We've resurrected it a few times and Dave Jones has done some work to
> get a new system running but it published incorrect score files.  He did
> some massaging and published a new rule set with the old score file a
> few months ago.  But since then we've been battling the machine randomly
> hanging.  And work to resurrect the old boxes failed.
>
> We are looking for some little darn insidious issue not processing rule
> scores right.

It sounds a bit like you guys are hitting a wall?

Could any help from the community get things going again? If so, what kind
of skillset would be useful to tackle this thing?

>
> Regards,
>
> KAM
>
>



Re: SA not receiving fixed FORGED_MUA_MOZILLA update?

Posted by Sebastian Arcus <s....@open-t.co.uk>.
On 15/09/17 12:21, Kevin A. McGrail wrote:
> On 9/15/2017 6:54 AM, Sebastian Arcus wrote:
>> Thank you for the reply. Does that mean that no new rules have been 
>> pushed to SA installations in the past 5 months - or only some rules 
>> get pushed through?
> 
> The system has been "down" since March 15 in that everything is working 
> but we are purposefully not changing the DNS entries.
> 
> We've resurrected it a few times and Dave Jones has done some work to 
> get a new system running but it published incorrect score files.  He did 
> some massaging and published a new rule set with the old score file a 
> few months ago.  But since then we've been battling the machine randomly 
> hanging.  And work to resurrect the old boxes failed.
> 
> We are looking for some little darn insidious issue not processing rule 
> scores right.

Thank you for the update Kevin - and for the hard work of everyone 
involved. Let's hope the rules updates will be operational again soon.

Re: SA not receiving fixed FORGED_MUA_MOZILLA update?

Posted by "Kevin A. McGrail" <ke...@mcgrail.com>.
On 9/15/2017 6:54 AM, Sebastian Arcus wrote:
> Thank you for the reply. Does that mean that no new rules have been 
> pushed to SA installations in the past 5 months - or only some rules 
> get pushed through?

The system has been "down" since March 15 in that everything is working 
but we are purposefully not changing the DNS entries.

We've resurrected it a few times and Dave Jones has done some work to 
get a new system running but it published incorrect score files.  He did 
some massaging and published a new rule set with the old score file a 
few months ago.  But since then we've been battling the machine randomly 
hanging.  And work to resurrect the old boxes failed.

We are looking for some little darn insidious issue not processing rule 
scores right.

Regards,

KAM


Re: SA not receiving fixed FORGED_MUA_MOZILLA update?

Posted by Sebastian Arcus <s....@open-t.co.uk>.
On 15/09/17 11:41, Kevin A. McGrail wrote:
> On 9/15/2017 6:11 AM, Sebastian Arcus wrote:
>> I am having problems with false positives for FORGED_MUA_MOZILLA for 
>> Yahoo emails. I see this has been already dealt with here and pushed 
>> to the 3.4 and trunk branches:
>>
>> https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7411
>>
>> However, even after running sa-update, the file 20_meta_tests.cf still 
>> doesn't have the changes on my servers. Has this bugfix not been 
>> applied for some reason?
>>
>> 20_meta_tests.cf on my machines is dated May 17 and June 24 for the 
>> 3.00xx and 4.00xx dirs under /var/lib/spamassassin - which is after 
>> the date the bugfix was pushed (2017-04-22).
>>
>> I run SA 3.4.1 and 4.0.0 on different machines
> 
> Hi Sebastian,
> 
> The Rule Promotion and Generation of tarballs with correct scores has 
> been an ongoing issue for the SpamAssassin team.  I would suggest you 
> simply copy the rule to your local.cf at this time.

Hi Kevin,

Thank you for the reply. Does that mean that no new rules have been 
pushed to SA installations in the past 5 months - or only some rules get 
pushed through?

Re: SA not receiving fixed FORGED_MUA_MOZILLA update?

Posted by "Kevin A. McGrail" <ke...@mcgrail.com>.
On 9/15/2017 6:11 AM, Sebastian Arcus wrote:
> I am having problems with false positives for FORGED_MUA_MOZILLA for 
> Yahoo emails. I see this has been already dealt with here and pushed 
> to the 3.4 and trunk branches:
>
> https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7411
>
> However, even after running sa-update, the file 20_meta_tests.cf still 
> doesn't have the changes on my servers. Has this bugfix not been 
> applied for some reason?
>
> 20_meta_tests.cf on my machines is dated May 17 and June 24 for the 
> 3.00xx and 4.00xx dirs under /var/lib/spamassassin - which is after 
> the date the bugfix was pushed (2017-04-22).
>
> I run SA 3.4.1 and 4.0.0 on different machines

Hi Sebastian,

The Rule Promotion and Generation of tarballs with correct scores has 
been an ongoing issue for the SpamAssassin team.  I would suggest you 
simply copy the rule to your local.cf at this time.
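
As a rough aid for that copy step, the sketch below pulls out the lines
of a .cf file that define a given rule, so they can be reviewed and
pasted into local.cf. It assumes the usual "<keyword> <RULENAME> ..."
layout (meta, describe, score, tflags and so on) and skips commented-out
lines; lines_for_rule is a made-up helper name. It does not follow
dependencies, so any sub-rules referenced by a meta expression would
still have to be copied by hand if they changed too.

```python
def lines_for_rule(cf_text, rule_name):
    """Return the non-comment lines whose second field is rule_name."""
    wanted = []
    for line in cf_text.splitlines():
        fields = line.split()
        if (
            len(fields) >= 2
            and not fields[0].startswith("#")
            and fields[1] == rule_name
        ):
            wanted.append(line.rstrip())
    return wanted
```

The input would be the text of a fixed 20_meta_tests.cf from the 3.4
branch or trunk, with "FORGED_MUA_MOZILLA" as the rule name. Running
"spamassassin --lint" afterwards checks that local.cf still parses.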

Regards,
KAM