Posted to dev@whimsical.apache.org by sebb <se...@gmail.com> on 2017/06/13 15:45:47 UTC

Apmail jobs - speed up to allow more frequent runs

The apmail listing jobs (mods, subs) are generally quite expensive to run.
However the output does not change very frequently.

So a possible approach might be to run a cheap(er) check to look for
changes to the source files and only run the extraction when there is
a change.
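
A minimal sketch of such a check, assuming that file modification
times are a good enough signal of change. The names run_if_changed,
stamp_file and extract are hypothetical and do not come from the
existing jobs; extract stands in for whatever currently does the
expensive listing.

```python
import os
from pathlib import Path


def newest_mtime(paths):
    """Return the latest modification time among existing paths (0.0 if none exist)."""
    return max((os.stat(p).st_mtime for p in paths if os.path.exists(p)), default=0.0)


def run_if_changed(paths, stamp_file, extract):
    """Call extract() only if a source file changed since the last recorded run.

    stamp_file (hypothetical) holds the newest mtime seen by the previous run.
    Returns True if extract() was called, False if the run was skipped.
    """
    newest = newest_mtime(paths)
    stamp = Path(stamp_file)
    last_seen = float(stamp.read_text()) if stamp.exists() else -1.0
    if newest <= last_seen:
        return False  # nothing changed: skip the expensive extraction
    extract()
    # Record the mtime observed before extraction, so a change that arrives
    # while extract() is running is picked up by the next check.
    stamp.write_text(repr(newest))
    return True
```

The cheap part is a stat() per source file; the stamp is only updated
after a successful extraction, so a failed run is retried next time.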

This should allow list-subs and list-mods to be run hourly rather than 6-hourly.

I'm happy to look at how to implement this if people think it's worth pursuing?

Re: Apmail jobs - speed up to allow more frequent runs

Posted by Sam Ruby <ru...@intertwingly.net>.
On Tue, Jun 13, 2017 at 11:45 AM, sebb <se...@gmail.com> wrote:
> The apmail listing jobs (mods, subs) are generally quite expensive to run.
> However the output does not change very frequently.
>
> So a possible approach might be to run a cheap(er) check to look for
> changes to the source files and only run the extraction when there is
> a change.
>
> This should allow list-subs and list-mods to be run hourly rather than 6-hourly.
>
> I'm happy to look at how to implement this if people think it's worth pursuing?

Please do.  You might want to first check to see if I am being too conservative.

From memory: the scripts ran faster than I expected.  IIRC, the
longest time was spent 'paging back in' the contents of directories
into cache, as in if you ran the script a second time it would run
much faster than the first time.

Also, running "ls -ltr lists/*/*/Log" indicates that there is likely
*some* change to at least one mailing list every hour.  "ls -ltr
lists/*/*/mod/Log" indicates that moderation changes are considerably
less frequent, as in every couple of days there is a batch update.

Perhaps it might be worth exploring splitting the updates by DNS
address?  As in, once every 10 to 15 minutes, look for a changed log
file and, if one is found, extract the mods or subs as the case may be
and send them over.  At the moment, there are only two whimsy tools
that parse this data, so changing the structure of the data would not
require much work.
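
A rough sketch of what the "look for a changed log file" step could
look like, assuming the two glob patterns above and assuming that the
first directory level under lists/ is the domain and the second is the
list name. The function name changed_by_domain and the "subs"/"mods"
labels are made up for illustration.

```python
from collections import defaultdict
from pathlib import Path


def changed_by_domain(root, since):
    """Group lists whose Log file changed after `since` (a Unix timestamp).

    Returns {(domain, kind): [list names]}, where kind is "subs" for
    <domain>/<list>/Log and "mods" for <domain>/<list>/mod/Log.
    The directory layout is an assumption based on the ls patterns.
    """
    root = Path(root)
    changed = defaultdict(list)
    for pattern, kind in (("*/*/Log", "subs"), ("*/*/mod/Log", "mods")):
        for log in root.glob(pattern):
            if log.stat().st_mtime > since:
                domain, list_name = log.relative_to(root).parts[:2]
                changed[(domain, kind)].append(list_name)
    return {key: sorted(names) for key, names in changed.items()}
```

Only the (domain, kind) pairs returned here would need to be
re-extracted and sent over; everything else can be left alone.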

- Sam Ruby