Posted to dev@whimsical.apache.org by sebb <se...@gmail.com> on 2020/07/12 16:33:42 UTC

Refreshing SVN from private repos

On Sun, 12 Jul 2020 at 02:42, Sam Ruby <ru...@intertwingly.net> wrote:
>
> On Sat, Jul 11, 2020 at 8:26 PM sebb <se...@gmail.com> wrote:
> >
> > However there is an issue in testing, in that SVN is only updated
> > every 10 minutes, so the workspace won't show the updated files
> > immediately.
>
> Almost, but not quite, a year ago, you asked this question:
>
> https://lists.apache.org/thread.html/64cfda89066f658480f0997517555609ca06649d6971808bfba01c61%40%3Cusers.infra.apache.org%3E
>
> A little over two months ago I mentioned that there was progress:
>
> "The infrastructure team has already enabled pubsub for LDAP data, and
> is working on pubsub for private svn repositories. "
>
> https://lists.apache.org/thread.html/r5a9b6b61a9b4300d9e40003c58f9e87ac34cbe8dada7eaf2f3b02fbd%40%3Cdev.whimsical.apache.org%3E
>
> Search #whimsy in slack for "pubsub" in April of this year for more status.
>
> We have an existing cron job that uses pubsub to watch for changes in the
> whimsy source code and, when a change occurs, initiates a puppet run.
> These updates complement but don't replace the running of puppet by
> the infrastructure team.
>
> In the (possibly near) future, we could have a different cron job using
> pubsub to watch for changes in svn repositories, match changes against
> repository.yml, and if found do a full or partial "rake svn:update".
> These updates could complement but not replace the running of svn
> update every 10 minutes.  If this were done, a number of mail
> subscriptions could be retired.
>
> I suggest two separate cron jobs as these two tasks would need to run
> under separate user ids.

Some possible alternatives:
- Whimsy knows when it has updated SVN, so it could send a message to
a server job asking for the relevant SVN workspace to be updated. This
would not catch external changes, but most of the work is done by
Whimsy now.
- emeritus listings could be cached by Whimsy internally, rather than
kept as externally maintained listing files.
Given that the changes are infrequent, and the number of files is
likely to be small, it should not be too much of an overhead. It should
not be a problem with karma, as only members get to see the links
anyway.
- add more commit email subscriptions to catch updates to busier
private repos

Re: Refreshing SVN from private repos

Posted by sebb <se...@gmail.com>.
On Sun, 12 Jul 2020 at 19:27, Sam Ruby <ru...@intertwingly.net> wrote:
>
> On Sun, Jul 12, 2020 at 12:33 PM sebb <se...@gmail.com> wrote:
> >
> > On Sun, 12 Jul 2020 at 02:42, Sam Ruby <ru...@intertwingly.net> wrote:
> > >
> > > On Sat, Jul 11, 2020 at 8:26 PM sebb <se...@gmail.com> wrote:
> > > >
> > > > However there is an issue in testing, in that SVN is only updated
> > > > every 10 minutes, so the workspace won't show the updated files
> > > > immediately.
> > >
> > > Almost, but not quite, a year ago, you asked this question:
> > >
> > > https://lists.apache.org/thread.html/64cfda89066f658480f0997517555609ca06649d6971808bfba01c61%40%3Cusers.infra.apache.org%3E
> > >
> > > A little over two months ago I mentioned that there was progress:
> > >
> > > "The infrastructure team has already enabled pubsub for LDAP data, and
> > > is working on pubsub for private svn repositories. "
> > >
> > > https://lists.apache.org/thread.html/r5a9b6b61a9b4300d9e40003c58f9e87ac34cbe8dada7eaf2f3b02fbd%40%3Cdev.whimsical.apache.org%3E
> > >
> > > Search #whimsy in slack for "pubsub" in April of this year for more status.
> > >
> > > > We have an existing cron job that uses pubsub to watch for changes in the
> > > > whimsy source code and, when a change occurs, initiates a puppet run.
> > > These updates complement but don't replace the running of puppet by
> > > the infrastructure team.
> > >
> > > > In the (possibly near) future, we could have a different cron job using
> > > pubsub to watch for changes in svn repositories, match changes against
> > > repository.yml, and if found do a full or partial "rake svn:update".
> > > These updates could complement but not replace the running of svn
> > > update every 10 minutes.  If this were done, a number of mail
> > > subscriptions could be retired.
> > >
> > > I suggest two separate cron jobs as these two tasks would need to run
> > > under separate user ids.
>
> I checked with Humbedooh on #asfinfra.  Apparently the code is
> complete; from a technical perspective, all that is left to be done is
> to add the pubsub hook to our svn repositories and do final testing.
> If this is of interest, I encourage you to open a JIRA asking that
> this be done.

OK.

> > Some possible alternatives:
> > - Whimsy knows when it has updated SVN, so it could send a message to
> > a server job asking for the relevant SVN workspace to be updated. This
> > would not catch external changes, but most of the work is done by
> > Whimsy now.
>
> The board agenda tool basically does this, and this is a fine thing to
> do.  FWIW, the board agenda tool does this by essentially touching a
> file in the file system which is being watched (using the listen gem).

Neat

> > - emeritus listings could be cached by Whimsy internally, rather than
> > kept as externally maintained listing files.
> > Given that the changes are infrequent, and the number of files is
> > likely to be small, it should not be too much of an overhead. It should
> > not be a problem with karma, as only members get to see the links
> > anyway.
>
> I'm not enthusiastic about this approach.  Having multiple users
> update a common cache can lead to odd problems.  I saw this all the
> time with the original whimsy-vm, which is why I enforced a strict
> separation starting with whimsy-vm2.
>
> Additionally, when it comes down to it, the entire /srv/svn directory
> is nothing more than a cache itself - all the data found in there
> could be obtained by a direct call to SVN.

The workspace is deliberately not owned by the webserver user.
So the workspace cache cannot be updated by Whimsy code.

> > - add more commit email subscriptions to catch updates to busier
> > private repos
>
> This, too, is fine.  You've now gone through the process of setting up
> a new VM, so you know what it takes.  I would only add that mail
> sometimes gets backed up.  It is a great thing when it works, but at
> times it makes things more confusing: once you get used to a commit
> producing an almost immediate response, you tend to forget that the
> communication channel may be subject to unpredictable delays, and you
> wonder why the change you made hasn't been picked up.
>
>  - - -
>
> I will add that in the case of the secretary workbench, there is
> another option.  There is an event stream opened between the client
> and server.  Any time you get a request you can start a thread and
> then immediately reply with potentially stale data.  The thread can
> get fresh data and, should it be different, send an event to the
> client.  The board agenda tool does this.
>
> - Sam Ruby

Re: Refreshing SVN from private repos

Posted by Sam Ruby <ru...@intertwingly.net>.
On Sun, Jul 12, 2020 at 12:33 PM sebb <se...@gmail.com> wrote:
>
> On Sun, 12 Jul 2020 at 02:42, Sam Ruby <ru...@intertwingly.net> wrote:
> >
> > On Sat, Jul 11, 2020 at 8:26 PM sebb <se...@gmail.com> wrote:
> > >
> > > However there is an issue in testing, in that SVN is only updated
> > > every 10 minutes, so the workspace won't show the updated files
> > > immediately.
> >
> > Almost, but not quite, a year ago, you asked this question:
> >
> > https://lists.apache.org/thread.html/64cfda89066f658480f0997517555609ca06649d6971808bfba01c61%40%3Cusers.infra.apache.org%3E
> >
> > A little over two months ago I mentioned that there was progress:
> >
> > "The infrastructure team has already enabled pubsub for LDAP data, and
> > is working on pubsub for private svn repositories. "
> >
> > https://lists.apache.org/thread.html/r5a9b6b61a9b4300d9e40003c58f9e87ac34cbe8dada7eaf2f3b02fbd%40%3Cdev.whimsical.apache.org%3E
> >
> > Search #whimsy in slack for "pubsub" in April of this year for more status.
> >
> > We have an existing cron job that uses pubsub to watch for changes in the
> > whimsy source code and, when a change occurs, initiates a puppet run.
> > These updates complement but don't replace the running of puppet by
> > the infrastructure team.
> >
> > In the (possibly near) future, we could have a different cron job using
> > pubsub to watch for changes in svn repositories, match changes against
> > repository.yml, and if found do a full or partial "rake svn:update".
> > These updates could complement but not replace the running of svn
> > update every 10 minutes.  If this were done, a number of mail
> > subscriptions could be retired.
> >
> > I suggest two separate cron jobs as these two tasks would need to run
> > under separate user ids.

I checked with Humbedooh on #asfinfra.  Apparently the code is
complete; from a technical perspective, all that is left to be done is
to add the pubsub hook to our svn repositories and do final testing.
If this is of interest, I encourage you to open a JIRA asking that
this be done.
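
The svn watcher proposed in the quoted text above would need to map each
commit notification to the checkouts it affects.  A minimal sketch of that
matching step follows, in Python for illustration only.  The checkout names
and URLs, the notification fields (a repository root plus changed paths),
and the function name are all assumptions; they are not the actual
repository.yml layout or pubsub payload.

```python
# Hypothetical stand-in for the entries a file like repository.yml might hold:
# checkout name -> SVN URL.  Not the real Whimsy configuration.
CHECKOUTS = {
    "foundation": "https://svn.example.org/repos/private/foundation",
    "board": "https://svn.example.org/repos/private/foundation/board",
    "documents": "https://svn.example.org/repos/private/documents",
}


def affected_checkouts(repo_root, changed_paths, checkouts=CHECKOUTS):
    """Return the sorted names of checkouts whose URL covers a changed path.

    repo_root and changed_paths model a commit notification; the real
    pubsub payload may be shaped differently.
    """
    changed_urls = [
        repo_root.rstrip("/") + "/" + path.lstrip("/") for path in changed_paths
    ]
    hits = []
    for name, url in checkouts.items():
        prefix = url.rstrip("/") + "/"
        # Appending "/" lets a change to the checkout directory itself match,
        # and stops "board" from matching a sibling such as "board-old".
        if any((changed + "/").startswith(prefix) for changed in changed_urls):
            hits.append(name)
    return sorted(hits)
```

The returned names could then drive a partial update of just those
checkouts, with the regular 10-minute update still running as a fallback.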

> Some possible alternatives:
> - Whimsy knows when it has updated SVN, so it could send a message to
> a server job asking for the relevant SVN workspace to be updated. This
> would not catch external changes, but most of the work is done by
> Whimsy now.

The board agenda tool basically does this, and this is a fine thing to
do.  FWIW, the board agenda tool does this by essentially touching a
file in the file system which is being watched (using the listen gem).
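
The touch-a-file idea can be sketched with the standard library alone.
This is not the board agenda tool's code (which uses the Ruby listen gem);
it is a polling stand-in in Python, and the names request_update and
TriggerWatcher are hypothetical.

```python
from pathlib import Path


def request_update(trigger: Path) -> None:
    """Web side: signal that a refresh is wanted by touching a file."""
    trigger.touch()


class TriggerWatcher:
    """Server side: report whether the trigger file changed since last check."""

    def __init__(self, trigger: Path):
        self.trigger = trigger
        self.last_seen = self._mtime()

    def _mtime(self):
        # A missing file is a valid state: nobody has asked for an update yet.
        try:
            return self.trigger.stat().st_mtime_ns
        except FileNotFoundError:
            return None

    def changed(self) -> bool:
        """Return True once per observed change in the file's mtime."""
        current = self._mtime()
        if current != self.last_seen:
            self.last_seen = current
            return True
        return False
```

A server job would call changed() in a loop (or use a real file-system
notification library) and run the workspace update when it returns True.
Because the watcher only reads the trigger file, it can presumably run
under a different account from the process that touches it.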

> - emeritus listings could be cached by Whimsy internally, rather than
> kept as externally maintained listing files.
> Given that the changes are infrequent, and the number of files is
> likely to be small, it should not be too much of an overhead. It should
> not be a problem with karma, as only members get to see the links
> anyway.

I'm not enthusiastic about this approach.  Having multiple users
update a common cache can lead to odd problems.  I saw this all the
time with the original whimsy-vm, which is why I enforced a strict
separation starting with whimsy-vm2.

Additionally, when it comes down to it, the entire /srv/svn directory
is nothing more than a cache itself - all the data found in there
could be obtained by a direct call to SVN.

> - add more commit email subscriptions to catch updates to busier
> private repos

This, too, is fine.  You've now gone through the process of setting up
a new VM, so you know what it takes.  I would only add that mail
sometimes gets backed up.  It is a great thing when it works, but at
times it makes things more confusing: once you get used to a commit
producing an almost immediate response, you tend to forget that the
communication channel may be subject to unpredictable delays, and you
wonder why the change you made hasn't been picked up.

 - - -

I will add that in the case of the secretary workbench, there is
another option.  There is an event stream opened between the client
and server.  Any time you get a request you can start a thread and
then immediately reply with potentially stale data.  The thread can
get fresh data and, should it be different, send an event to the
client.  The board agenda tool does this.
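
The reply-stale-then-refresh pattern can be sketched as follows.  This is
an illustration in Python, not the workbench or agenda code: the class
name, fetch_fresh (standing in for a direct SVN read) and send_event
(standing in for a push on the event stream) are all assumptions.

```python
import threading


class StaleWhileRefresh:
    """Answer requests from a cache and refresh it in a background thread."""

    def __init__(self, fetch_fresh, send_event, initial=None):
        self._fetch_fresh = fetch_fresh  # callable returning current data
        self._send_event = send_event    # callable notifying the client
        self._cached = initial
        self._lock = threading.Lock()

    def handle_request(self):
        """Start a refresh, then return the possibly stale cached value.

        The worker thread is returned too, so callers can join it if needed.
        """
        with self._lock:
            stale = self._cached
        worker = threading.Thread(target=self._refresh, daemon=True)
        worker.start()
        return stale, worker

    def _refresh(self):
        fresh = self._fetch_fresh()
        with self._lock:
            changed = fresh != self._cached
            self._cached = fresh
        # Only notify the client when the data actually differs.
        if changed:
            self._send_event(fresh)
```

The request never waits on the slow fetch; the client sees the cached
value first and receives an event only if the fresh data turns out to be
different.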

- Sam Ruby