Posted to dev@whimsical.apache.org by "John D. Ament" <jo...@apache.org> on 2017/06/10 12:58:50 UTC

Regenerating static content on deploy?

I just pushed up a change that requires a change to the podlings.json
file.  However, that file is only generated in cron.  I was wondering
whether it would make sense, on deploy, to run a script that regenerates
the static files?

John

Re: Regenerating static content on deploy?

Posted by Sam Ruby <ru...@intertwingly.net>.
On Sun, Jun 11, 2017 at 1:44 PM, John D. Ament <jo...@gmail.com> wrote:
>>
>> If the sum total of running all of the cron jobs is multiple minutes
>> of elapsed time, running this list of programs every time in order to
>> reduce the time window where data is out of sync seems like overkill.
>> It may also result in the one item that you are looking for to run to
>> actually be run later than expected.  I'm also concerned a bit about
>> these scripts possibly running at the same time as the cron jobs.
>>
> Agreed.  We want deployments to be fast.  Low downtime.  Do we bounce
> apache on deploy? I didn't see anything.

Apache httpd is only bounced when things like its configuration files change.

No bounce is required for CGI scripts.

Individual Passenger applications are restarted when the code for that
application changes.  This is done in the Rakefile.
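
For readers unfamiliar with Passenger: one documented way to restart a
single application is to touch tmp/restart.txt inside that application's
directory.  The sketch below only illustrates that mechanism.  It is not
copied from the Whimsy Rakefile, and the helper name restart_passenger_app
and the directory layout are assumptions.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: ask Passenger to restart one application by
# touching tmp/restart.txt under that application's root directory.
def restart_passenger_app(app_root)
  restart_file = File.join(app_root, 'tmp', 'restart.txt')
  FileUtils.mkdir_p(File.dirname(restart_file))
  FileUtils.touch(restart_file)
  restart_file
end

# Demonstration against a throwaway directory, not a real deployment.
Dir.mktmpdir do |root|
  path = restart_passenger_app(root)
  puts "touched #{path}" if File.exist?(path)
end
```

Because only the touched application is restarted, httpd itself and the
other applications keep serving requests.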

- Sam Ruby

Re: Regenerating static content on deploy?

Posted by "John D. Ament" <jo...@gmail.com>.
On Sun, Jun 11, 2017 at 1:22 PM Sam Ruby <ru...@intertwingly.net> wrote:

> On Sun, Jun 11, 2017 at 12:53 PM, John D. Ament <jo...@apache.org>
> wrote:
> > I decided to sleep on this before running into a response.
>
> My apologies if I caused any offense.
>
>
You didn't.  Understand, I have mostly a Java/.NET background, and most of
the scripting work I've done is in Python.  But I don't write full-blown
apps in Python, so I'm trying to figure out what the right approach would
be, given the limited Ruby experience I have.


> > On Sat, Jun 10, 2017 at 11:22 AM Sam Ruby <ru...@intertwingly.net>
> wrote:
> >
> >> On Sat, Jun 10, 2017 at 11:10 AM, John D. Ament <jo...@apache.org>
> >> wrote:
> >> > Unless someone objects, I'm going to introduce a shell script that executes
> >> > the contents of the cron jobs.  I'm then going to raise a PR to infra to
> >> > update this section
> >> > https://github.com/apache/infrastructure-puppet/blob/deployment/modules/whimsy_server/manifests/init.pp#L70
> >> > to
> >> > include a call to said script.
> >>
> >> Not an objection, but thinking out loud.
> >>
> >> Some of those cronjobs are lightning fast.  Others may take a while
> >> and consume network or CPU resources.
> >>
> >> For the fast ones, I find it mildly amusing that they use whimsy/asf
> >> library functions and capture the result in JSON, and you are parsing
> >> the JSON instead of using the same library functions.  What this means
> >> is that you get stale data when you have ready access to fresher data.
> >
> > Do you have a concrete list of these?  I often find arbitrary statements
> > like this hard to follow, mostly because of how unfamiliar I am with
> > something, in this case Whimsy.
>
> Again, my apologies... I misread the code.  You are using the library
> functions.
>
>
No, you're fine.  I had no idea you were referring to anything of mine.


> >> For the slow ones, I don't think it is appropriate to run them after every
> >> push.
> >>
> > Again, having a list of these would be useful.  I'm not sure I can judge
> > without having all of the information.
>
> The one that I am most familiar with is the site scan.  Sebb can
> comment on others.
>
> >> What I would suggest instead is a CGI script that lists the cron jobs
> >> that are runnable under the apache web server user id, and gives you a
> >> button that you can push that will run that specific job on request.
> >> What this would enable is for anybody (or perhaps just ASF members?)
> >> to rerun the script at will.
> >>
> > No, specifically I disagree with this kind of approach.  If we're going to
> > follow an automatic deploy pattern, the application needs to be
> > self-healing.  If there's manual intervention, something's wrong.  Yesterday we
> > got stuck because:
>
> If the sum total of running all of the cron jobs is multiple minutes
> of elapsed time, running this list of programs every time in order to
> reduce the time window where data is out of sync seems like overkill.
> It may also result in the one item that you are looking for to run to
> actually be run later than expected.  I'm also concerned a bit about
> these scripts possibly running at the same time as the cron jobs.
>
>
Agreed.  We want deployments to be fast.  Low downtime.  Do we bounce
apache on deploy? I didn't see anything.


> I am also aware that at times there are problems that only show up in
> production, and we may find ourselves wanting to run scripts because,
> for example, LDAP has been updated.
>
> > - SVN repo list is cached in the app (but not in cron)
> > - The cron jobs happened to run in such an order that the svn update
> > happened after public_podlings.json was generated
> >
> > What I'm actually now thinking is that we should implement two bug fixes:
> >
> > - in ASF::SVN if the repo doesn't exist, reload the repository.yml file and
> > try again.
>
> +1
>
> > - in some scripts, make sure that any repos you need are up to date (rather
> > than rely on the svn update cron)
>
> Absolutely.  Operative words: some scripts.  The question often comes
> down to whether it is worthwhile to have less stale information later
> or have quicker (but sometimes more stale) data responses.
>

Right.  The immediate one I can think of (and the subject of all this) is
public_podlings.rb: when it runs, it should first do an svn update (maybe
controlled by a flag) of the incubator-content and, now, incubator-podlings
svn repos, so that it has all of the latest info.
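
A minimal sketch of that idea, under stated assumptions: the --update flag
name, the working-copy paths and the helper names are all hypothetical, and
this is not the actual public_podlings.rb.  The command runner is
injectable so the logic can be tried without svn installed.

```ruby
require 'optparse'

# Hypothetical working-copy paths; the real locations are not given here.
REPOS = %w[/srv/svn/incubator-content /srv/svn/incubator-podlings].freeze

# Parse the command line; --update is an assumed flag name.
def parse_options(argv)
  options = { update: false }
  OptionParser.new do |opts|
    opts.on('--update', 'run svn update on the needed working copies first') do
      options[:update] = true
    end
  end.parse(argv)
  options
end

# Build one "svn update" command per working copy.
def update_commands(repos)
  repos.map { |path| ['svn', 'update', '--quiet', path] }
end

# Refresh the working copies only when asked to; returns the commands run.
def refresh(options, repos = REPOS, runner = ->(cmd) { system(*cmd) })
  return [] unless options[:update]

  update_commands(repos).each { |cmd| runner.call(cmd) }
end

# Dry run: print the commands instead of executing them.
refresh(parse_options(['--update']), REPOS, ->(cmd) { puts cmd.join(' ') })
```

Keeping the update behind a flag lets the cron entry stay fast by default
while a deploy-time run can ask for fresh data.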


>
> Some tools, like the board agenda tool, opt to show possibly stale
> data quickly and immediately get fresher data and update if necessary.
>
> > Thoughts?
>
> - Sam Ruby

Re: Regenerating static content on deploy?

Posted by Sam Ruby <ru...@intertwingly.net>.
On Sun, Jun 11, 2017 at 12:53 PM, John D. Ament <jo...@apache.org> wrote:
> I decided to sleep on this before running into a response.

My apologies if I caused any offense.

> On Sat, Jun 10, 2017 at 11:22 AM Sam Ruby <ru...@intertwingly.net> wrote:
>
>> On Sat, Jun 10, 2017 at 11:10 AM, John D. Ament <jo...@apache.org>
>> wrote:
>> > Unless someone objects, I'm going to introduce a shell script that executes
>> > the contents of the cron jobs.  I'm then going to raise a PR to infra to
>> > update this section
>> > https://github.com/apache/infrastructure-puppet/blob/deployment/modules/whimsy_server/manifests/init.pp#L70
>> > to
>> > include a call to said script.
>>
>> Not an objection, but thinking out loud.
>>
>> Some of those cronjobs are lightning fast.  Others may take a while
>> and consume network or CPU resources.
>>
>> For the fast ones, I find it mildly amusing that they use whimsy/asf
>> library functions and capture the result in JSON, and you are parsing
>> the JSON instead of using the same library functions.  What this means
>> is that you get stale data when you have ready access to fresher data.
>
> Do you have a concrete list of these?  I often find arbitrary statements
> like this hard to follow, mostly because of how unfamiliar I am with
> something, in this case Whimsy.

Again, my apologies... I misread the code.  You are using the library functions.

>> For the slow ones, I don't think it is appropriate to run them after every
>> push.
>>
> Again, having a list of these would be useful.  I'm not sure I can judge
> without having all of the information.

The one that I am most familiar with is the site scan.  Sebb can
comment on others.

>> What I would suggest instead is a CGI script that lists the cron jobs
>> that are runnable under the apache web server user id, and gives you a
>> button that you can push that will run that specific job on request.
>> What this would enable is for anybody (or perhaps just ASF members?)
>> to rerun the script at will.
>>
> No, specifically I disagree with this kind of approach.  If we're going to
> follow an automatic deploy pattern, the application needs to be
> self-healing.  If there's manual intervention, something's wrong.  Yesterday we
> got stuck because:

If the sum total of running all of the cron jobs is multiple minutes
of elapsed time, running the whole list on every deploy just to
narrow the window in which data is out of sync seems like overkill.
It may also mean that the one job you are actually waiting on
runs later than expected.  I'm also a bit concerned about
these scripts possibly running at the same time as the cron jobs.
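
One common way to address the overlap concern is an advisory file lock
taken by both the deploy-time run and the cron job.  The sketch below uses
Ruby's File#flock; the lock file name and the with_job_lock helper are
hypothetical, and nothing in this thread says Whimsy does this.

```ruby
require 'tmpdir'

# Run the block only if nobody else holds the lock; otherwise return
# :skipped without waiting.
def with_job_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |file|
    # With LOCK_NB, flock returns false instead of blocking when the
    # lock is already held elsewhere.
    return :skipped unless file.flock(File::LOCK_EX | File::LOCK_NB)

    yield
  end
end

# Hypothetical lock file shared by the cron job and the deploy hook.
lock_path = File.join(Dir.tmpdir, 'regenerate-static.lock')
result = with_job_lock(lock_path) { :regenerated }
puts result
```

The lock is released when the file is closed at the end of the block.  A
cron entry would have to take the same lock for this to help, for example
by wrapping its command with the flock(1) utility where that is available.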

I am also aware that at times there are problems that only show up in
production, and we may find ourselves wanting to run scripts because,
for example, LDAP has been updated.

> - SVN repo list is cached in the app (but not in cron)
> - The cron jobs happened to run in such an order that the svn update
> happened after public_podlings.json was generated
>
> What I'm actually now thinking is that we should implement two bug fixes:
>
> - in ASF::SVN if the repo doesn't exist, reload the repository.yml file and
> try again.

+1

> - in some scripts, make sure that any repos you need are up to date (rather
> than rely on the svn update cron)

Absolutely.  Operative words: some scripts.  The question often comes
down to whether it is worthwhile to have less stale information later
or have quicker (but sometimes more stale) data responses.

Some tools, like the board agenda tool, opt to show possibly stale
data quickly and immediately get fresher data and update if necessary.
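
The pattern described here is often called stale-while-revalidate.  The
sketch below illustrates it in general terms only; it is not how the board
agenda tool is implemented, and the class and method names are made up.

```ruby
# Hypothetical cache: hand back whatever is cached right away, then fetch
# fresher data in the background and report it only if it changed.
class StaleThenFresh
  def initialize(initial, &fetcher)
    @value = initial
    @fetcher = fetcher
    @mutex = Mutex.new
  end

  # Returns [cached_value, refresh_thread].  The optional block is called
  # with the fresh value only when it differs from the cached one.
  def read(&on_update)
    stale = @mutex.synchronize { @value }
    refresher = Thread.new do
      fresh = @fetcher.call
      changed = @mutex.synchronize do
        different = fresh != @value
        @value = fresh
        different
      end
      on_update.call(fresh) if changed && on_update
    end
    [stale, refresher]
  end
end

cache = StaleThenFresh.new('cached agenda') { 'agenda fetched just now' }
shown, refresher = cache.read { |fresh| puts "updating display: #{fresh}" }
puts "showing immediately: #{shown}"
refresher.join
```

The trade-off is the one noted above: the first response is fast but may
be stale, and the correction arrives a moment later.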

> Thoughts?

- Sam Ruby

Re: Regenerating static content on deploy?

Posted by "John D. Ament" <jo...@apache.org>.
I decided to sleep on this before running into a response.

On Sat, Jun 10, 2017 at 11:22 AM Sam Ruby <ru...@intertwingly.net> wrote:

> On Sat, Jun 10, 2017 at 11:10 AM, John D. Ament <jo...@apache.org>
> wrote:
> > Unless someone objects, I'm going to introduce a shell script that executes
> > the contents of the cron jobs.  I'm then going to raise a PR to infra to
> > update this section
> > https://github.com/apache/infrastructure-puppet/blob/deployment/modules/whimsy_server/manifests/init.pp#L70
> > to
> > include a call to said script.
>
> Not an objection, but thinking out loud.
>
> Some of those cronjobs are lightning fast.  Others may take a while
> and consume network or CPU resources.
>
> For the fast ones, I find it mildly amusing that they use whimsy/asf
> library functions and capture the result in JSON, and you are parsing
> the JSON instead of using the same library functions.  What this means
> is that you get stale data when you have ready access to fresher data.
>

Do you have a concrete list of these?  I often find arbitrary statements
like this hard to follow, mostly because of how unfamiliar I am with
something, in this case Whimsy.


>
> For the slow ones, I don't think it is appropriate to run them after every
> push.
>
>
Again, having a list of these would be useful.  I'm not sure I can judge
without having all of the information.


> What I would suggest instead is a CGI script that lists the cron jobs
> that are runnable under the apache web server user id, and gives you a
> button that you can push that will run that specific job on request.
> What this would enable is for anybody (or perhaps just ASF members?)
> to rerun the script at will.
>
>
No, specifically I disagree with this kind of approach.  If we're going to
follow an automatic deploy pattern, the application needs to be
self-healing.  If there's manual intervention, something's wrong.  Yesterday we
got stuck because:

- SVN repo list is cached in the app (but not in cron)
- The cron jobs happened to run in such an order that the svn update
happened after public_podlings.json was generated

What I'm actually now thinking is that we should implement two bug fixes:

- in ASF::SVN if the repo doesn't exist, reload the repository.yml file and
try again.
- in some scripts, make sure that any repos you need are up to date (rather
than rely on the svn update cron)
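
The first fix could look roughly like the sketch below.  It is a stand-in
and not the real ASF::SVN code: the RepoIndex class is hypothetical, and
the flat name-to-URL layout of the YAML file is an assumption.

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical cached lookup that reloads its YAML file once on a miss.
class RepoIndex
  def initialize(yaml_path)
    @yaml_path = yaml_path
    @repos = load_repos
  end

  # Return the entry for name.  On a miss, reload the file and retry
  # once before giving up with a KeyError.
  def [](name)
    @repos.fetch(name) do
      @repos = load_repos
      @repos.fetch(name) { raise KeyError, "unknown repository: #{name}" }
    end
  end

  private

  def load_repos
    YAML.safe_load(File.read(@yaml_path)) || {}
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'repository.yml')
  File.write(path, { 'incubator-content' => 'https://example.invalid/content' }.to_yaml)
  index = RepoIndex.new(path)

  # A deploy adds a repository after the index was cached.
  File.write(path, { 'incubator-content' => 'https://example.invalid/content',
                     'incubator-podlings' => 'https://example.invalid/podlings' }.to_yaml)
  puts index['incubator-podlings']
end
```

Reloading only on a miss keeps the common path as cheap as the cached
version while removing the need to restart the app for a new entry.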

Thoughts?


Re: Regenerating static content on deploy?

Posted by Sam Ruby <ru...@intertwingly.net>.
On Sat, Jun 10, 2017 at 11:10 AM, John D. Ament <jo...@apache.org> wrote:
> Unless someone objects, I'm going to introduce a shell script that executes
> the contents of the cron jobs.  I'm then going to raise a PR to infra to
> update this section
> https://github.com/apache/infrastructure-puppet/blob/deployment/modules/whimsy_server/manifests/init.pp#L70
> to
> include a call to said script.

Not an objection, but thinking out loud.

Some of those cronjobs are lightning fast.  Others may take a while
and consume network or CPU resources.

For the fast ones, I find it mildly amusing that they use whimsy/asf
library functions and capture the result in JSON, and you are parsing
the JSON instead of using the same library functions.  What this means
is that you get stale data when you have ready access to fresher data.

For the slow ones, I don't think it is appropriate to run them after every push.

What I would suggest instead is a CGI script that lists the cron jobs
that are runnable under the apache web server user id, and gives you a
button that you can push that will run that specific job on request.
What this would enable is for anybody (or perhaps just ASF members?)
to rerun the script at will.

I'd start by parsing the whimsy_server/manifests/cronjobs.pp.
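
As a rough illustration of that starting point, the sketch below pulls
cron resources out of Puppet manifest text with regular expressions and
keeps the ones owned by a given user.  The sample manifest, the job names
and the www-data user are invented for the example; the real cronjobs.pp
may be structured differently, and a regex is not a full Puppet parser
(a command containing "}" would break it).

```ruby
# Matches: cron { 'title': ...attributes... }
CRON_RESOURCE = /cron\s*\{\s*'([^']+)'\s*:(.*?)\}/m
# Matches: key => 'quoted value'   or   key => bare_value
ATTRIBUTE = /(\w+)\s*=>\s*(?:'([^']*)'|([^,\s]+))/

# Invented sample input, standing in for the real manifest.
SAMPLE = <<~PUPPET
  cron { 'public_podlings':
    ensure  => present,
    command => 'ruby public_podlings.rb public_podlings.json',
    user    => 'www-data',
    minute  => 15,
  }

  cron { 'svn_update':
    command => 'rake svn:update',
    user    => 'svnuser',
    hour    => '*',
  }
PUPPET

# Extract name, user and command for every cron resource in the text.
def cron_jobs(manifest)
  manifest.scan(CRON_RESOURCE).map do |title, body|
    attrs = body.scan(ATTRIBUTE).to_h { |key, quoted, bare| [key, quoted || bare] }
    { name: title, user: attrs['user'], command: attrs['command'] }
  end
end

# Keep only the jobs that run as the given user.
def runnable_as(jobs, user)
  jobs.select { |job| job[:user] == user }
end

runnable_as(cron_jobs(SAMPLE), 'www-data').each do |job|
  puts "#{job[:name]}: #{job[:command]}"
end
```

A CGI page could render one button per entry in that filtered list.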

- Sam Ruby

Re: Regenerating static content on deploy?

Posted by "John D. Ament" <jo...@apache.org>.
Unless someone objects, I'm going to introduce a shell script that executes
the contents of the cron jobs.  I'm then going to raise a PR to infra to
update this section
https://github.com/apache/infrastructure-puppet/blob/deployment/modules/whimsy_server/manifests/init.pp#L70
to
include a call to said script.

John
