Posted to sysadmins@spamassassin.apache.org by "Kevin A. McGrail" <ke...@mcgrail.com> on 2017/11/02 12:27:02 UTC
Re: Eureka: truncation of 72_active.cf
+sysadmins@s.a.o so we don't lose these conversations.
Consider asking gstein@apache.org for his help on SVN. I simply don't
play with enough tags, branches, revisions, etc.
Either there is a bug and we are injecting the wrong version, or you are
just looking at things incorrectly.
However, I think you perhaps are just doing something wrong??
This command, for example, appears to pull an SA rule set. Isn't that
the expected behavior?
svn co -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk trunk-new-rules-set1 | more
A trunk-new-rules-set1/rulesrc
A trunk-new-rules-set1/rulesrc/10_force_active.cf
A trunk-new-rules-set1/rulesrc/sandbox
A trunk-new-rules-set1/rulesrc/sandbox/jhardin
A trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_MIME_no_text.cf
A trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_misc_testing.cf
A trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_lotsa_money.cf
A trunk-new-rules-set1/rulesrc/sandbox/jhardin/20_MIME_in_body.cf
What do you get?
Regards,
KAM
On 11/2/2017 5:29 AM, Merijn van den Kroonenberg wrote:
> I am a bit confused about the corpus revision.
>
> Take for example this:
>
> Revision: 1813664
> Author: spamassassin_role
> Date: Sunday, 29 October 2017 3:47:00
> Message:
> updated scores for revision 1813595 active rules added since last
> mass-check
>
> And in the sysadmin mail from last night (rescore example)
>
> svn co -r 1813595 http://svn.apache.org/repos/asf/spamassassin/trunk trunk-new-rules-set1
>
> So the commit mentions revision 1813595; I assume it's also mentioned
> in the corpus logs and is what actually gets checked out.
>
> BUT 1813595 is not a valid revision for the spamassassin project??
>
> It's actually a revision in another Apache project.
>
> Revision: 1813595
> Author: deepak
> Date: Saturday, 28 October 2017 10:33:50
> Message:
> Improved: Add rat exclude files to excludes those files that does not
> need
> license header
> (OFBIZ-9856)
>
> Updated rat-excludes.txt file
> ----
> Modified : /ofbiz/ofbiz-framework/trunk/rat-excludes.txt
>
> So is something going wrong somewhere with detecting the correct revision?
>
>
> -----Original Message----- From: Kevin A. McGrail
> Sent: Wednesday, November 01, 2017 10:49 PM
> To: David Jones ; Merijn van den Kroonenberg
> Subject: Re: Eureka: truncation of 72_active.cf
>
> On 11/1/2017 5:31 PM, David Jones wrote:
>>
>> I found another bug in the DKIM_VALID_EF rule description that needed
>> to be wrapped in a version check that was causing the ruleset
>> validation to fail with return code 4. Just committed another fix
>> that should take care of this.
>>
>>
>> Given the timing of the rule promotions and the masscheck validation,
>> this could take another ~40 hours to work itself out. I really want to
>> improve the way things work to speed up this cycle time.
>>
> Agreed. I am very excited to finally have this done though.
Re: Eureka: truncation of 72_active.cf
Posted by Dave Jones <da...@apache.org>.
Well crap. I found another odd dependency: the masscheck processing is
thrown off when commits are made outside the few-hour window:
https://wiki.apache.org/spamassassin/InfraNotes2017#nitemc
automc@sa-vm1:~$ ~/svn/trunk/build/mkupdates/run_nightly
+ promote_active_rules
+ pwd
/usr/local/spamassassin/automc/svn/trunk
+ /usr/bin/perl build/mkupdates/listpromotable
HTTP get: http://ruleqa.spamassassin.org/1-days-ago?xml=1
no 'mcviewing', 'mcsubmitters' microformats on day 1
URL: http://ruleqa.spamassassin.org/1-days-ago?xml=1
+ exit 25
This is a critical cron job that sets up the new rule promotions daily.
I will dig into this one deeper but it seems that if the masscheck SA
revision gets out of sync with new commits that may not even be
ruleset-related, then days have to pass with no commits before the stars
will align properly again. Geez. What a mess with this workflow!
I think we need to carefully document the current SVN workflow and
redesign it to handle this better. The masscheck processing of rulesets
only needs to be tied to the ruleset revision staged in the rsync area
for the current 24 hour period. Maybe we need a new masscheck-specific
tag separate from the rule promotion tags today?
Rule promotions can happen at any time, multiple times a day as long as
they pass a lint check.
Masscheck validations are currently done only once a day.
My initial thought is that parts of the scripts are using the latest SVN
revision and parts are using the latest tagged revision from rule
promotions. When these get out of sync, we don't get enough
masschecking of the proper revision to keep moving everything forward.
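To make the suspected mismatch concrete, here is a minimal Python sketch of a check that reports which stage is working from an older revision than the others. The function name and the stage names are hypothetical, and pairing the two revision numbers from this thread with particular stages is purely illustrative; the real scripts under build/mkupdates may track revisions quite differently.

```python
from typing import Dict, List


def lagging_stages(stage_revisions: Dict[str, int]) -> List[str]:
    """Return the stages whose revision is older than the newest one in use.

    stage_revisions maps a (hypothetical) stage name to the SVN revision
    that stage is currently working from.
    """
    if not stage_revisions:
        return []
    newest = max(stage_revisions.values())
    # A stage lags if it uses any revision older than the newest one seen.
    return sorted(stage for stage, rev in stage_revisions.items() if rev < newest)


# Illustrative values only: which stage used which revision is an assumption.
revisions = {"masscheck": 1813595, "rule_promotion": 1813664}
print(lagging_stages(revisions))
```

An empty result would mean every stage agrees on one revision, which is the condition that seems to be required before promotions can move forward.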
Dave
On 11/02/2017 07:46 AM, Kevin A. McGrail wrote:
> On 11/2/2017 8:40 AM, Merijn van den Kroonenberg wrote:
>> The checkout works as you would expect. But it's just very confusing
>> that a revision is used which is not inside the spamassassin project.
>> It might also cause side effects in other parts of the process, as David
>> mentioned. But the checkout part is not actually broken.
>>
>> I do have some considerable experience with subversion....the problem
>> is more what the intention of the code should be ;)
>
> My recommendation is to not try to unravel the thought process behind
> the code. Stay focused on the goal which is to produce rules and
> distribute them. Anything you do towards that goal is good. If we
> break some eggs to make some omelets, great.
>
> Ideally, I would like to publish more daily rulesets, focus on
> optimization, etc.
>
> Regards,
>
> KAM
>
Re: Eureka: truncation of 72_active.cf
Posted by "Kevin A. McGrail" <ke...@mcgrail.com>.
On 11/2/2017 8:40 AM, Merijn van den Kroonenberg wrote:
> The checkout works as you would expect. But it's just very confusing
> that a revision is used which is not inside the spamassassin project.
> It might also cause side effects in other parts of the process, as David
> mentioned. But the checkout part is not actually broken.
>
> I do have some considerable experience with subversion....the problem
> is more what the intention of the code should be ;)
My recommendation is to not try to unravel the thought process behind
the code. Stay focused on the goal which is to produce rules and
distribute them. Anything you do towards that goal is good. If we
break some eggs to make some omelets, great.
Ideally, I would like to publish more daily rulesets, focus on
optimization, etc.
Regards,
KAM
Re: Eureka: truncation of 72_active.cf
Posted by Merijn van den Kroonenberg <me...@web2all.nl>.
The checkout works as you would expect. But it's just very confusing
that a revision is used which is not inside the spamassassin project.
It might also cause side effects in other parts of the process, as David
mentioned. But the checkout part is not actually broken.
I do have some considerable experience with subversion....the problem is
more what the intention of the code should be ;)
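As background to the confusion above: a Subversion revision number identifies a state of the whole repository, and this repository is shared with other ASF projects, so -r 1813595 on the spamassassin trunk URL returns that subtree as it stood at that revision even though the commit itself only touched OFBiz. A minimal Python sketch of that lookup follows. The Commit type, the function name, and revision 1813500 with its path are hypothetical and used only for illustration; only the 1813595 entry comes from the log quoted in this thread.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Commit:
    revision: int
    changed_paths: Tuple[str, ...]


def last_changed_revision(
    log: Sequence[Commit], path_prefix: str, requested: int
) -> Optional[int]:
    """Newest revision <= requested that touched anything under path_prefix."""
    touching = [
        c.revision
        for c in log
        if c.revision <= requested
        and any(p.startswith(path_prefix) for p in c.changed_paths)
    ]
    return max(touching, default=None)


LOG = [
    # Hypothetical earlier spamassassin commit, invented for this example.
    Commit(1813500, ("/spamassassin/trunk/rulesrc/10_force_active.cf",)),
    # From the log quoted in the thread: this revision only touched OFBiz.
    Commit(1813595, ("/ofbiz/ofbiz-framework/trunk/rat-excludes.txt",)),
]

# Asking for spamassassin trunk at 1813595 resolves to the older commit.
print(last_changed_revision(LOG, "/spamassassin/trunk/", 1813595))
```

On a real working copy, the "Last Changed Rev" field of svn info reports the equivalent value, which may be a clearer number to record in commit messages than the repository-wide revision.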