Posted to users@subversion.apache.org by Phil Endecott <sp...@chezphil.org> on 2005/11/09 20:05:25 UTC
Update from post-commit hook
Dear All,
I have been thinking about using a post-commit hook to keep a public
"latest snapshot" of part of the repository up to date. This could be
as simple as putting
cd /latest/snapshot/; svn update
in the post-commit hook. I've done some quick experiments which have
not been very successful, so I have some questions:
- Where do error messages from the hook scripts go? (Using Apache.)
- Access is normally via Apache; is the nested call to svn OK, or does
/latest/snapshot/ need to be a file: checkout, or what? (One of my
experiments led to a runaway svn process, making me think that something
recursive was going on.)
- (Possibly related to the above:) I require HTTP AUTH for both read and
write to the repository. How can the apache user, who runs the hook
script, authenticate itself in the nested call?
- Only part of the repository is being tracked in this snapshot. I
could make the update conditional by checking if it has changed using
svnlook, maybe something like: "svnlook changed | grep -q something &&
svn update". But maybe an svn update when nothing has changed is just
as fast - any comments?
- I don't want this to slow down commits if I can help it. Is it OK to
background the hook script, i.e. to have "svn update &" in the
post-commit file?
Any suggestions much appreciated.
--Phil.
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org
Re: Update from post-commit hook
Posted by Dominic Anello <da...@danky.com>.
On 2005-11-15 08:05:55 -0700, Mark Parker wrote:
> Ryan Schmidt wrote:
> >It should be ok, but I would recommend using a file:/// checkout
> >instead because it will be faster. Make sure you have a FSFS
> >repository, as BDB repositories have problems with concurrent access
> >over different protocols.
>
> No, BDB doesn't have problems with multiple access methods (or at least
> any problems that FSFS doesn't have). You run into problems when process
> A creates files in the repository (usually logs for BDB) that are
> unreadable/unwriteable for process B. This is (as far as I understand)
> MORE likely to happen with FSFS, because EVERY COMMIT is GUARANTEED to
> create one or more new files in the repository, while BDB logfiles are
> only created when the last one fills up.
>
> I've taken my information from the book
> (http://svnbook.red-bean.com/nightly/en/svn.serverconfig.multimethod.html),
> and if I've misinterpreted something here, please feel free to correct me.
>
> Mark
BDB will create and log a transaction for reads as well as writes, so
it's possible for a "read only" operation such as svnlook or svnadmin
verify to generate a new log file. I've personally wedged my repository
by carelessly running verify as root, for example.
You can try it yourself:
[svn@lynx ~/ec-svn/repo/db]$ ls -l log.*
-rw-rw-r-- 1 svn svn 1046553 Nov 14 15:17 log.0000002124
-rw-rw-r-- 1 svn svn 941420 Nov 15 13:47 log.0000002125
[svn@lynx ~/ec-svn/repo/db]$ svnlook history .. /trunk > /dev/null
[svn@lynx ~/ec-svn/repo/db]$ ls -l log.*
-rw-rw-r-- 1 svn svn 1046553 Nov 14 15:17 log.0000002124
-rw-rw-r-- 1 svn svn 954652 Nov 15 13:56 log.0000002125
Note that log.0000002125 grew by about 13k just from a simple svnlook
history call. A verify on a large repo can easily roll the log over many
times.
It's hard to say which one is more prone to wedging due to permission
issues. Barring performance problems with one or the other, I think it
basically comes down to whatever one you are most comfortable with.
-Dominic
Re: Update from post-commit hook
Posted by Mark Parker <ma...@msdhub.com>.
Ryan Schmidt wrote:
> It should be ok, but I would recommend using a file:/// checkout
> instead because it will be faster. Make sure you have a FSFS
> repository, as BDB repositories have problems with concurrent access
> over different protocols.
No, BDB doesn't have problems with multiple access methods (or at least
any problems that FSFS doesn't have). You run into problems when process
A creates files in the repository (usually logs for BDB) that are
unreadable/unwriteable for process B. This is (as far as I understand)
MORE likely to happen with FSFS, because EVERY COMMIT is GUARANTEED to
create one or more new files in the repository, while BDB logfiles are
only created when the last one fills up.
I've taken my information from the book
(http://svnbook.red-bean.com/nightly/en/svn.serverconfig.multimethod.html),
and if I've misinterpreted something here, please feel free to correct me.
Mark
Re: Update from post-commit hook
Posted by Ryan Schmidt <su...@ryandesign.com>.
On Nov 9, 2005, at 22:28, Phil Endecott wrote:
>> I would definitely recommend spawning off another process
>> somehow, but simply "svn update &" may be too simplistic. If Bob
>> commits r100 at 12:00:00 and the svn update process gets spawned
>> at 12:00:01 and takes 10 seconds to complete, and Joe commits
>> r101 at 12:00:02 firing off another svn update process at
>> 12:00:03, then Joe's update process will probably throw an error
>> that the working copy is in use / dirty / whatever it says
>
> If the update instead waited for the working copy to be unlocked,
> rather than failing immediately, there wouldn't be a problem.
The exact message printed if you try to update a working copy twice
simultaneously is this:
svn: Working copy '.' locked
svn: run 'svn cleanup' to remove locks (type 'svn help cleanup' for
details)
The point is that a human must decide if it is appropriate to run svn
cleanup. It is not appropriate if there is another task updating the
working copy. It is appropriate if there merely was another such task
and it has crashed or terminated improperly.
If, as you suggest, svn update simply waited for the working copy to
be unlocked, it would wait all day if the previous update in fact had
a problem.
> Is this an issue only when the hook script is backgrounded, or does
> it also apply in the normal case?
I do not know how Subversion handles the timing of the situation
where two people try to commit different changes at about the same
time. I have a hunch that Subversion handles only one commit at a
time, so the problem would not exist if you did not background the task.
> Is there a lock somewhere that is not released until the post-
> commit hook has finished? I had assumed that the locks were
> released at the point of commit.
I'm talking about locks relating to the working copy, not the
repository. Whenever you say svn update (and probably other svn
commands), the working copy is locked to prevent another process from
trying to work on the same working copy. When the first task is done,
it's supposed to unlock the working copy again.
Re: Update from post-commit hook
Posted by Phil Endecott <sp...@chezphil.org>.
Ryan Schmidt wrote:
> Phil Endecott wrote:
>> - Where do error messages from the hook scripts go? (Using Apache.)
> ...error messages go nowhere.
Not a great decision, I must say. How hard would it be to have them
appear in Apache's error log?
>> - I don't want this to slow down commits if I can help it. Is it OK
>> to background the hook script, i.e. to have "svn update &" in the
>> post-commit file?
>
> I would definitely recommend spawning off another process somehow, but
> simply "svn update &" may be too simplistic. If Bob commits r100 at
> 12:00:00 and the svn update process gets spawned at 12:00:01 and takes
> 10 seconds to complete, and Joe commits r101 at 12:00:02 firing off
> another svn update process at 12:00:03, then Joe's update process will
> probably throw an error that the working copy is in use / dirty /
> whatever it says
If the update instead waited for the working copy to be unlocked, rather
than failing immediately, there wouldn't be a problem. Is this an issue
only when the hook script is backgrounded, or does it also apply in the
normal case? Is there a lock somewhere that is not released until the
post-commit hook has finished? I had assumed that the locks were
released at the point of commit.
Cheers,
--Phil.
Re: Update from post-commit hook
Posted by Ryan Schmidt <su...@ryandesign.com>.
On Nov 9, 2005, at 21:05, Phil Endecott wrote:
> Dear All,
>
> I have been thinking about using a post-commit hook to keep a
> public "latest snapshot" of part of the repository up to date.
> This could be as simple as putting
>
> cd /latest/snapshot/; svn update
The hook script runs with an empty environment, so you'll need to call
any executables by their complete path, e.g. /usr/bin/svn. You can test
your hook script on the command line by running something like
env -i /path/to/hooks/post-commit . This is useful for debugging because...
> in the post-commit hook. I've done some quick experiments which
> have not been very successful, so I have some questions:
>
> - Where do error messages from the hook scripts go? (Using Apache.)
...error messages go nowhere. You can redirect them to a log file if
you so desire, like
cd /latest/snapshot
/usr/bin/svn update 2>/path/to/svnupdate.log
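A minimal sketch of that logging idea as a complete hook, assuming svn lives at /usr/bin/svn and reusing the placeholder log path above. The function name update_snapshot and the overridable SVN, SNAPSHOT and LOG variables are illustrative only, not anything Subversion defines. Unlike the one-liner, it appends to the log and captures stdout as well as stderr:

```shell
#!/bin/sh
# Sketch of a post-commit hook that logs the snapshot update.
# SVN, SNAPSHOT and LOG are illustrative settings, not Subversion names.
SVN="${SVN:-/usr/bin/svn}"                 # absolute path: hooks get no PATH
SNAPSHOT="${SNAPSHOT:-/latest/snapshot}"   # working copy to refresh
LOG="${LOG:-/path/to/svnupdate.log}"       # placeholder path from the thread

update_snapshot() {
    # Run in a subshell so the cd does not leak, and append both
    # stdout and stderr to the log instead of discarding them.
    (
        echo "=== snapshot update ==="
        cd "$SNAPSHOT" && "$SVN" update
    ) >>"$LOG" 2>&1
}

# Subversion calls post-commit with the repository path and the new
# revision number; only run the update when invoked that way.
if [ "$#" -ge 2 ]; then
    update_snapshot
fi
```

The log file has to be writable by whichever user runs the hook (the Apache user in this setup), otherwise the messages are lost again.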
> - Access is normally via Apache; is the nested call to svn OK, or
> does /latest/snapshot/ need to be a file: checkout, or what? (One
> of my experiments led to a runaway svn process, making me think
> that something recursive was going on.)
It should be ok, but I would recommend using a file:/// checkout
instead because it will be faster. Make sure you have a FSFS
repository, as BDB repositories have problems with concurrent access
over different protocols.
> - (Possibly related to the above:) I require HTTP AUTH for both read
> and write to the repository. How can the apache user, who runs the
> hook script, authenticate itself in the nested call?
You would need to find the Apache user's home directory and set up
a .subversion directory there with cached authentication credentials.
It'll be easier if you just use a file:/// checkout and
bypass all that.
> - Only part of the repository is being tracked in this snapshot. I
> could make the update conditional by checking if it has changed
> using svnlook, maybe something like: "svnlook changed | grep -q
> something && svn update". But maybe an svn update when nothing has
> changed is just as fast - any comments?
In the script I wrote I first got a list of all paths changed by the
commit, then I had a mapping of snapshot working copies to repository
paths, and updated only the relevant working copies. I'm not sure if
this ended up being faster or slower than just updating everything
all the time. The problem in our setup was (and continues to be) the
RAM cache, which keeps getting thrown out and used for other things.
In testing, everything is very quick because data about the working
copy files remains in the server's cache. As soon as it's in
production, though, and our 10 developers are using Apache and
sending mails and everything else, the cache thrashes about and it
takes forever to do things.
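For the single-snapshot case, the "check what changed first" idea might look roughly like the sketch below. It is not the script described above: the TRACKED subtree, the function name commit_touches_tracked and the tool paths are all assumptions for illustration. It relies on svnlook changed printing a four-character status prefix before each path.

```shell
#!/bin/sh
# Sketch: update the snapshot only if the commit touched a tracked path.
# All names and paths below are illustrative assumptions.
SVNLOOK="${SVNLOOK:-/usr/bin/svnlook}"
SVN="${SVN:-/usr/bin/svn}"
SNAPSHOT="${SNAPSHOT:-/latest/snapshot}"
TRACKED="${TRACKED:-trunk/public/}"   # hypothetical repository subtree

# Usage: commit_touches_tracked REPOS REV
# Succeeds if any path changed in REV starts with $TRACKED.
commit_touches_tracked() {
    changed=$("$SVNLOOK" changed -r "$2" "$1") || return 1
    while IFS= read -r line; do
        # Strip the status columns ("U   ", "UU  ", "A   ", ...).
        case ${line#????} in
            "$TRACKED"*) return 0 ;;
        esac
    done <<EOF
$changed
EOF
    return 1
}

# Subversion passes the repository path and revision to post-commit.
if [ "$#" -ge 2 ] && commit_touches_tracked "$1" "$2"; then
    cd "$SNAPSHOT" && "$SVN" update
fi
```

Whether the extra svnlook call is cheaper than an unconditional svn update would have to be measured on the server in question.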
> - I don't want this to slow down commits if I can help it. Is it
> OK to background the hook script, i.e. to have "svn update &" in
> the post-commit file?
I would definitely recommend spawning off another process somehow,
but simply "svn update &" may be too simplistic. If Bob commits r100
at 12:00:00 and the svn update process gets spawned at 12:00:01 and
takes 10 seconds to complete, and Joe commits r101 at 12:00:02 firing
off another svn update process at 12:00:03, then Joe's update process
will probably throw an error that the working copy is in use /
dirty / whatever it says, and when all's said and done, by 12:00:15,
the snapshot will be up to date with r100 but will never update to
r101—not until Sally commits r102 at 15:34:00.
While I have thought of these issues, I have not yet programmed the
correct solution to them; if someone else has gone that extra mile,
I'd sure like to hear what the best solution is.
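One possible approach, offered as a sketch rather than a proven solution: let each commit leave a "pending" marker, and let whichever process wins an atomic mkdir lock keep updating until no marker remains. That way Joe's r101 is picked up by the updater Bob's commit started. The STATE directory and the run_update name are hypothetical, and a lock left behind by a crashed updater still needs a human to remove it, as with svn cleanup.

```shell
#!/bin/sh
# Sketch: serialize snapshot updates without losing any.
# SVN, SNAPSHOT and STATE are illustrative settings.
SVN="${SVN:-/usr/bin/svn}"
SNAPSHOT="${SNAPSHOT:-/latest/snapshot}"
STATE="${STATE:-/var/tmp/snapshot-update}"   # hypothetical state directory

run_update() {
    mkdir -p "$STATE" || return 1
    # Record that an update is wanted before competing for the lock.
    : >"$STATE/pending"
    while [ -e "$STATE/pending" ]; do
        # mkdir is atomic: only one process can create the lock.
        # If it fails, the current holder will notice "pending".
        mkdir "$STATE/lock" 2>/dev/null || return 0
        rm -f "$STATE/pending"
        (cd "$SNAPSHOT" && "$SVN" update) >>"$STATE/log" 2>&1
        rmdir "$STATE/lock"
    done
}

# Subversion passes the repository path and revision to post-commit.
# Background the work with the standard streams detached so that the
# commit should not have to wait for the update to finish.
if [ "$#" -ge 2 ]; then
    run_update </dev/null >/dev/null 2>&1 &
fi
```

Because the marker is removed before each update starts, any commit that sets it during an update causes one more pass after the current one finishes.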