Posted to dev@subversion.apache.org by Bill Tutt <ra...@lyra.org> on 2002/09/19 16:04:13 UTC

#739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

I've added the following comment to issue #739. For some reason the
comment hasn't shown up on the issues email list yet:

To ensure the I (Isolation) in ACID, we need a guaranteed way to 
detect processes that died without cleaning up after themselves. 

One way of doing this is to follow the guidelines in this URL:
http://www.sleepycat.com/docs/ref/env/faq.html and create a watcher 
process.

Another would be to move all code that called libsvn_fs into a separate 
process.

I think the watcher process is the simplest approach. It'd work 
something like this:

When the watcher process starts up, it's assumed the machine is 
starting, and you're guaranteed that no other programs are accessing 
the BDB store. Therefore, the watcher process recovers the store on 
startup.

Before libsvn_fs opens the BDB store, it registers the current process 
with the watcher process. If this fails, libsvn_fs returns a failure. 
(This code should be a thread-safe ref count for the process from 
libsvn_fs's end.)

After libsvn_fs closes the BDB store, it notifies the watcher process 
that it has released the BDB store cleanly. (Again, this should be a 
thread-safe ref count.)
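
The thread-safe ref count described above might look like the following
minimal sketch. It is Python for brevity (libsvn_fs itself is C), and
`WatcherClient`, `register`, `deregister`, and `StoreRefCount` are
hypothetical names, not Subversion APIs. The assumption is that only the
first open in a process registers with the watcher and only the last
close deregisters.

```python
import threading


class WatcherClient:
    """Hypothetical stand-in for the channel to the watcher process."""

    def __init__(self):
        self.calls = []

    def register(self):
        self.calls.append("register")

    def deregister(self):
        self.calls.append("deregister")


class StoreRefCount:
    """Per-process, thread-safe count of open handles on the BDB store."""

    def __init__(self, watcher):
        self._watcher = watcher
        self._lock = threading.Lock()
        self._count = 0

    def acquire(self):
        with self._lock:
            if self._count == 0:
                # First opener registers the whole process. An exception
                # here propagates, so the caller can return a failure.
                self._watcher.register()
            self._count += 1

    def release(self):
        with self._lock:
            if self._count == 0:
                raise RuntimeError("release without matching acquire")
            self._count -= 1
            if self._count == 0:
                # Last closer tells the watcher the store was released cleanly.
                self._watcher.deregister()
```

Holding the lock across the watcher call keeps a second thread from
using the store before registration has actually succeeded.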

If the watcher process detects that a registered process has exited 
without deregistering, then the datastore is now suspect. The watcher 
process must now cause all in-progress transactions to be aborted.

This should probably be accomplished by using some asynchronous 
notification + timeout. If the timeout expires before the remaining 
processes exit, then the watcher process may kill them explicitly.

Once all of the registered processes have either exited with a useful 
failure message or been forcefully killed, the watcher is allowed to 
recover the datastore.

Any incoming registration requests must block until the database has 
been successfully recovered.
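
The bookkeeping described above can be sketched as a small state
machine. This is a toy model under stated assumptions: there are no real
processes or BDB calls, `recover` is a caller-supplied callable, and the
notify/timeout/kill step is reduced to the surviving processes calling
`deregister` (or being reported via `process_exited`). All names are
hypothetical.

```python
import threading


class Watcher:
    """Toy model of the watcher's registration and recovery rules."""

    def __init__(self, recover):
        self._recover = recover
        self._cond = threading.Condition()
        self._registered = set()
        self._suspect = False
        # On startup nothing else is attached, so recover unconditionally.
        self._recover()

    def register(self, pid, timeout=None):
        """Block while the store is suspect; return False on timeout."""
        with self._cond:
            ok = self._cond.wait_for(lambda: not self._suspect, timeout)
            if ok:
                self._registered.add(pid)
            return ok

    def deregister(self, pid):
        """A process released the store cleanly."""
        with self._cond:
            self._registered.discard(pid)
            self._recover_if_idle()

    def process_exited(self, pid):
        """A process went away; if it was still registered, it died dirty."""
        with self._cond:
            if pid in self._registered:
                self._registered.discard(pid)
                self._suspect = True
                self._recover_if_idle()

    def _recover_if_idle(self):
        # Caller holds self._cond. Recovery waits until nobody is attached.
        if self._suspect and not self._registered:
            self._recover()
            self._suspect = False
            self._cond.notify_all()
```

New registrations are refused (or blocked) from the moment a dirty exit
is seen until recovery has finished, which is the ordering the last
paragraph asks for.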

It's almost a shame that the watcher process can't release just the 
locks that were owned by the errant process because the process has 
exited.

If we could, then life would be much simpler.

FYI,
Bill


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Nuutti Kotivuori <na...@iki.fi>.

Garrett Rooney wrote:
> mark benedetto king wrote:
>> On Thu, Sep 19, 2002 at 03:01:24PM -0400, Garrett Rooney wrote:
>>> speaking of ra_pipe, when do we get to see some more of that code
>>> in the tree? ;-)

[...]

> i just think it would be a 'very good thing (tm)' to have the work
> that's being done on ra_pipe being done in the tree, rather than on
> someone's personal machine where it can easily get lost...

Hear! Hear!

Let us see the code, maybe we can even help make this very
important piece of code.

:-)

-- Naked


Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

mark benedetto king wrote:
> On Thu, Sep 19, 2002 at 03:01:24PM -0400, Garrett Rooney wrote:
> 
>>speaking of ra_pipe, when do we get to see some more of that code in the 
>>tree? ;-)
>>
> 
> 
> Well, it will be much easier to produce piecemeal, (more) easily reviewed
> patches if the XMLRPC-EPI and pipe-management stuff were already committed
> (which they would need to be, for a BDB wrapper to work).

well, you've got commit access to work on ra_pipe...  if you're planning 
on using xmlrpc-epi for that, then i don't see what's wrong with you 
adding that to the tree (either support for us linking against an 
installed version or a copy in our tree if that's what you're planning 
on doing).

i just think it would be a 'very good thing (tm)' to have the work 
that's being done on ra_pipe being done in the tree, rather than on 
someone's personal machine where it can easily get lost...

-garrett


Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by mark benedetto king <bk...@inquira.com>.

On Thu, Sep 19, 2002 at 03:01:24PM -0400, Garrett Rooney wrote:
> 
> speaking of ra_pipe, when do we get to see some more of that code in the 
> tree? ;-)
> 

Well, it will be much easier to produce piecemeal, (more) easily reviewed
patches if the XMLRPC-EPI and pipe-management stuff were already committed
(which they would need to be, for a BDB wrapper to work).

--ben


Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

mark benedetto king wrote:
> On Thu, Sep 19, 2002 at 10:24:19AM -0700, Bill Tutt wrote:
> 
>>>You are seriously proposing a root process that kills other processes?
>>>Including my Subversion aware editor which happens to be accessing the
>>>repository?  And my debugging session which happens to have reused the
>>>crashed Subversion client process ID?  This is a "completely and
>>>utterly robust" solution?
>>>
>>
>>Well, we must do something. 
>>
>>Alternative suggestions that do solve the problem are certainly
>>welcomed.
>>
>>Bill
>>
> 
> 
> Well, what oracle does (AFAICT) is this:
> 
> 1.) clients connect to a socket (AF_UNIX preferred, AF_INET okay).

i don't think apr currently supports AF_UNIX, although people have 
talked about adding that support.

> 2.) the listener on the socket forks off a child to service
> this connection, which is used for an entire transaction (and
> possibly more than one transaction).
> 
> 3.) the child then reads queries from the client, executes them,
> and writes the results back.  If the client dies, the child detects
> this easily, and rolls back the outstanding transaction, releases
> locks, and exits.
> 
> The code for this child process is extremely thoroughly QAed
> so that it, itself, is unlikely to fail without releasing the
> locks that it holds.  Further, the listener mentioned in (2)
> can detect when this (extremely rare) failure happens, and DTRT.
> 
> I think this may be the easiest safe strategy (though it does
> require a daemon) (which might be auto-started, as mentioned
> before).

i would very much prefer that it be auto started...  having to configure 
a daemon for a local repository is just too much to ask from users.

> It is easy to only start one instance of the daemon: only one
> process can bind to a particular socket at once.
> 
> I don't think it needs to be setuid; it doesn't matter who owns
> the daemon (any local user with write access can own the daemon).
> One could argue that there are some security problems with this
> approach (a local user could create a hostile server to attempt
> to exploit a client-side vulnerability).  If this is a concern,
> then make the repo owned by svn and mode 600 and start the
> daemon ahead of time, or make a setuid daemon-starter.
> 
> Of course, this requires that an API be marshalled
> into "flattened queries" and responses so that this communication
> can take place over the socket connection.
> 
> From a QA standpoint it might be easiest to do it by making
> an RPC-style wrapper around the BDB API.  We can even use
> the xmlrpc-epi library for marshalling of the requests;
> I've done the majority of the work necessary for this already
> for ra_pipe; I even have an IDL compiler. :-)
> 
> Unless there are objections, I can probably throw this together
> in a day or two.

i'd love to see this.  i'm not positive it's the best plan, but it 
sounds like a viable option to provide if nothing else.

speaking of ra_pipe, when do we get to see some more of that code in the 
tree? ;-)

-garrett

Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Justin Erenkrantz <je...@apache.org>.

On Fri, Sep 20, 2002 at 01:50:16PM -0400, mark benedetto king wrote:
> There are several different possibilities.  One approach that works well
> for database access is to use a "connection pool".  When a thread wants

apr_reslist_* in apr-util is meant for exactly this.  

I haven't used it, but I know that is what this is for.  -- justin


Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by mark benedetto king <bk...@inquira.com>.

On Fri, Sep 20, 2002 at 06:27:31PM +0100, Philip Martin wrote:
> mark benedetto king <bk...@Inquira.Com> writes:
> 
> > In effect, ra_dav -> NetBDB would *leverage* the MPM features of
> > apache; the fewer times apache forks, the fewer NetBDB services
> > would be required.
> 
> Really, I'd have thought you need at least one NetBDB per thread.  Are
> you proposing that threads share a NetBDB connection?  Assuming you
> really mean one per thread, how often does Apache create and destroy
> threads?
> 
> Using one NetBDB per thread will dramatically increase the number of
> processes, which kind of negates any benefits of using threads instead
> of processes.
> 

There are several different possibilities.  One approach that works well
for database access is to use a "connection pool".  When a thread wants
to open a connection, it first looks in the pool to see if any
previously established connections are available.  If not, it creates
a new one (unless the total number of outstanding connections has
exceeded a configurable threshold, in which case it blocks until
a new connection becomes available).  In any case, it uses the connection
it obtains, and when done, places it into the list of available
connections.  Frequently this list has a configurable maximum size;
if more than X connections are already available, this one is
simply closed.
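
The pool described above can be sketched roughly as follows. This is a
minimal illustration, not code from ra_dav or any NetBDB prototype:
`ConnectionPool`, `connect`, `max_total`, and `max_idle` are
hypothetical names, and a "connection" is assumed to be any object with
a `close()` method.

```python
import threading


class ConnectionPool:
    """Reuse idle connections, cap the total, and trim surplus idle ones."""

    def __init__(self, connect, max_total, max_idle):
        self._connect = connect      # factory that opens a new connection
        self._max_total = max_total  # cap on connections in use plus idle
        self._max_idle = max_idle    # cap on idle connections kept around
        self._idle = []
        self._total = 0
        self._cond = threading.Condition()

    def acquire(self):
        with self._cond:
            while True:
                if self._idle:
                    return self._idle.pop()
                if self._total < self._max_total:
                    self._total += 1   # reserve a slot, connect below
                    break
                self._cond.wait()      # at the cap: block until a release
        try:
            return self._connect()     # opened outside the lock
        except Exception:
            with self._cond:
                self._total -= 1       # give the reserved slot back
                self._cond.notify()
            raise

    def release(self, conn):
        with self._cond:
            if len(self._idle) < self._max_idle:
                self._idle.append(conn)
            else:
                conn.close()           # enough idle connections already
                self._total -= 1
            self._cond.notify()
```

Each thread holds a connection exclusively between `acquire` and
`release`, so the only shared state is the pool itself.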

Another mechanism would be to share connections between threads.  This
requires significantly more synchronization than the above approach, so
at high levels of concurrency, performance will degrade significantly.

Blending these two approaches is another possibility (i.e., share
N connections between M threads, M > N).

--ben


Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Philip Martin <ph...@codematters.co.uk>.

mark benedetto king <bk...@Inquira.Com> writes:

> In effect, ra_dav -> NetBDB would *leverage* the MPM features of
> apache; the fewer times apache forks, the fewer NetBDB services
> would be required.

Really, I'd have thought you need at least one NetBDB per thread.  Are
you proposing that threads share a NetBDB connection?  Assuming you
really mean one per thread, how often does Apache create and destroy
threads?

Using one NetBDB per thread will dramatically increase the number of
processes, which kind of negates any benefits of using threads instead
of processes.

-- 
Philip Martin


Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by mark benedetto king <bk...@inquira.com>.

On Fri, Sep 20, 2002 at 10:57:47AM +0100, Philip Martin wrote:
> 
> A standard Unix fork/exec server.

Right.

> 
> > The code for this child process is extremely thoroughly QAed
> 
> That's a red herring, all Subversion code is reviewed.  Expecting, or
> relying on, one part to be "thoroughly QAed" doesn't really help.
> 

You're right, but it would still be nice if the server were small
enough to be easy to review completely, so that it is unlikely
to be crashing on its own.  In particular, apache + mod_dav +
mod_dav_svn + libsvn_fs do not really meet this requirement, IMO.

> > so that it, itself, is unlikely to fail without releasing the
> > locks that it holds.  Further, the listener mentioned in (2)
> > can detect when this (extremely rare) failure happens, and DTRT.
> > 

This feature means that even if the small, simple server does
crash, things will be cleaned up.  We're not, as you suggest,
"relying on" the stability of the service.

> > I think this may be the easiest safe strategy (though it does
> > require a daemon) (which might be auto-started, as mentioned
> > before).
> 
> What worries me about this proposal is the performance impact on
> mod_dav_svn.  We already have a sophisticated server, Apache, where
> the fork/exec stuff has been abstracted into the MPM modules.  Adding
> this new server imposes the fork/exec model again.  Will this degrade
> (Windows?) servers?

This is a very valid concern, and one I would expect from anyone with
experience with apache.  However, my belief is that the two services
(HTTP and "NetBDB") are very different in *connection lifetime*.

HTTP connection lifespan: milliseconds (hopefully)
    (ignoring keep-alive optimizations)

    This fact is why HTTP *must* have an MPM infrastructure.

NetBDB connection lifetime: lifetime of client program.
    ra_local: seconds (or maybe tenths of a second, one day)
    ra_dav: lifetime of apache instance

    This fact is why NetBDB can probably get away without one.

In effect, ra_dav -> NetBDB would *leverage* the MPM features of
apache; the fewer times apache forks, the fewer NetBDB services
would be required.

Let me reiterate that this is *exactly* the model that Oracle uses;
that doesn't make it right, but they seem to be doing reasonably well
so far.

> 
> Aside from the multi-processing issue, I'm also concerned about memory
> usage.  Everything into and out of the database now requires memory to
> be allocated in both Apache and the new server.  There is also the
> overhead of sending the requests over the local socket.
> 

Yes; this is the price you pay for isolation.  I'm not sure that, from
a performance standpoint, it makes sense to not pre-fetch additional
records, but, at least in theory, the service would only need enough
memory to hold the largest record in any of the tables.  It could
conceivably even stream *this* record to the client.  Practically,
I think it makes sense to start with the easiest implementation; with
an API established, the service and the protocol between the client
and service can be optimized at our leisure.

> Now, ra_pipe will require some sort of svnd server, but that should be
> handling ra requests.
> 

Actually, ra_pipe itself doesn't require any daemons (though typically
it will be used with ssh, which requires sshd).

Typically, ra_pipe runs a command like "ssh user@remotehost svnpipe".
Then ra_pipe communicates with this sub-process.  All of the
daemon bits are taken care of by sshd.
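
As a rough illustration of that arrangement (not the actual ra_pipe
code), the client side only needs to start a command and talk to its
stdin/stdout. `round_trip` and `STAND_IN` are hypothetical names; a
local Python one-liner stands in for the remote end so the sketch runs
without ssh.

```python
import subprocess
import sys


def round_trip(argv, request):
    """Start a helper command, send it one request, return its reply."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    reply, _ = proc.communicate(request)  # writes, closes stdin, reads all
    return reply


# In real use argv would be something like
#   ["ssh", "user@remotehost", "svnpipe"]
# Here a stand-in child simply upper-cases whatever it receives.
STAND_IN = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read().upper())"]

if __name__ == "__main__":
    print(round_trip(STAND_IN, b"ping"))
```

No listening socket is involved on the client side; when the client
goes away the child sees end-of-file on its stdin.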

--ben


Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

Sander Striker wrote:
>>From: Garrett Rooney [mailto:rooneg@electricjellyfish.net]
>>Sent: 20 September 2002 15:26
> 
> 
>>Philip Martin wrote:
>>
>>
>>>>so that it, itself, is unlikely to fail without releasing the
>>>>locks that it holds.  Further, the listener mentioned in (2)
>>>>can detect when this (extremely rare) failure happens, and DTRT.
>>>>
>>>>I think this may be the easiest safe strategy (though it does
>>>>require a daemon) (which might be auto-started, as mentioned
>>>>before).
>>>
>>>
>>>What worries me about this proposal is the performance impact on
>>>mod_dav_svn.  We already have a sophisticated server, Apache, where
>>>the fork/exec stuff has been abstracted into the MPM modules.  Adding
>>>this new server imposes the fork/exec model again.  Will this degrade
>>>(Windows?) servers?
>>>
>>>Aside from the multi-processing issue, I'm also concerned about memory
>>>usage.  Everything into and out of the database now requires memory to
>>>be allocated in both Apache and the new server.  There is also the
>>>overhead of sending the requests over the local socket.
>>>
>>>Now, ra_pipe will require some sort of svnd server, but that should be
>>>handling ra requests.
>>
>>nothing says that this theoretical server has to be used for ALL 
>>filesystem access...  (well, maybe bill wants it to be, but personally, 
>>i don't think it's a 'hard and fast requirement')  we could have it only 
>>used for mediating access to the repository for ra_local, and then build 
>>enough smarts into the apache module to deal with similar problems (like 
>>philip theorized about in another email).
> 
> 
> Sure it needs to be ALL access.  Consider mixed environments where both
> ra_local and ra_dav are used... (and maybe other ra_ layers).

ack, good point, my bad.

-garrett

RE: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Sander Striker <st...@apache.org>.

> From: Garrett Rooney [mailto:rooneg@electricjellyfish.net]
> Sent: 20 September 2002 15:26

> Philip Martin wrote:
> 
> >>so that it, itself, is unlikely to fail without releasing the
> >>locks that it holds.  Further, the listener mentioned in (2)
> >>can detect when this (extremely rare) failure happens, and DTRT.
> >>
> >>I think this may be the easiest safe strategy (though it does
> >>require a daemon) (which might be auto-started, as mentioned
> >>before).
> > 
> > 
> > What worries me about this proposal is the performance impact on
> > mod_dav_svn.  We already have a sophisticated server, Apache, where
> > the fork/exec stuff has been abstracted into the MPM modules.  Adding
> > this new server imposes the fork/exec model again.  Will this degrade
> > (Windows?) servers?
> > 
> > Aside from the multi-processing issue, I'm also concerned about memory
> > usage.  Everything into and out of the database now requires memory to
> > be allocated in both Apache and the new server.  There is also the
> > overhead of sending the requests over the local socket.
> > 
> > Now, ra_pipe will require some sort of svnd server, but that should be
> > handling ra requests.
> 
> nothing says that this theoretical server has to be used for ALL 
> filesystem access...  (well, maybe bill wants it to be, but personally, 
> i don't think it's a 'hard and fast requirement')  we could have it only 
> used for mediating access to the repository for ra_local, and then build 
> enough smarts into the apache module to deal with similar problems (like 
> philip theorized about in another email).

Sure it needs to be ALL access.  Consider mixed environments where both
ra_local and ra_dav are used... (and maybe other ra_ layers).

Sander


Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

Philip Martin wrote:

>>so that it, itself, is unlikely to fail without releasing the
>>locks that it holds.  Further, the listener mentioned in (2)
>>can detect when this (extremely rare) failure happens, and DTRT.
>>
>>I think this may be the easiest safe strategy (though it does
>>require a daemon) (which might be auto-started, as mentioned
>>before).
> 
> 
> What worries me about this proposal is the performance impact on
> mod_dav_svn.  We already have a sophisticated server, Apache, where
> the fork/exec stuff has been abstracted into the MPM modules.  Adding
> this new server imposes the fork/exec model again.  Will this degrade
> (Windows?) servers?
> 
> Aside from the multi-processing issue, I'm also concerned about memory
> usage.  Everything into and out of the database now requires memory to
> be allocated in both Apache and the new server.  There is also the
> overhead of sending the requests over the local socket.
> 
> Now, ra_pipe will require some sort of svnd server, but that should be
> handling ra requests.

nothing says that this theoretical server has to be used for ALL 
filesystem access...  (well, maybe bill wants it to be, but personally, 
i don't think it's a 'hard and fast requirement')  we could have it only 
used for mediating access to the repository for ra_local, and then build 
enough smarts into the apache module to deal with similar problems (like 
philip theorized about in another email).

-garrett

Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Philip Martin <ph...@codematters.co.uk>.

mark benedetto king <bk...@Inquira.Com> writes:

> 1.) clients connect to a socket (AF_UNIX preferred, AF_INET okay).
> 
> 2.) the listener on the socket forks off a child to service
> this connection, which is used for an entire transaction (and
> possibly more than one transaction).
> 
> 3.) the child then reads queries from the client, executes them,
> and writes the results back.  If the client dies, the child detects
> this easily, and rolls back the outstanding transaction, releases
> locks, and exits.

A standard Unix fork/exec server.

> The code for this child process is extremely thoroughly QAed

That's a red herring, all Subversion code is reviewed.  Expecting, or
relying on, one part to be "thoroughly QAed" doesn't really help.

> so that it, itself, is unlikely to fail without releasing the
> locks that it holds.  Further, the listener mentioned in (2)
> can detect when this (extremely rare) failure happens, and DTRT.
> 
> I think this may be the easiest safe strategy (though it does
> require a daemon) (which might be auto-started, as mentioned
> before).

What worries me about this proposal is the performance impact on
mod_dav_svn.  We already have a sophisticated server, Apache, where
the fork/exec stuff has been abstracted into the MPM modules.  Adding
this new server imposes the fork/exec model again.  Will this degrade
(Windows?) servers?

Aside from the multi-processing issue, I'm also concerned about memory
usage.  Everything into and out of the database now requires memory to
be allocated in both Apache and the new server.  There is also the
overhead of sending the requests over the local socket.

Now, ra_pipe will require some sort of svnd server, but that should be
handling ra requests.

-- 
Philip Martin


Re: Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by mark benedetto king <bk...@inquira.com>.

On Thu, Sep 19, 2002 at 10:24:19AM -0700, Bill Tutt wrote:
> > 
> > You are seriously proposing a root process that kills other processes?
> > Including my Subversion aware editor which happens to be accessing the
> > repository?  And my debugging session which happens to have reused the
> > crashed Subversion client process ID?  This is a "completely and
> > utterly robust" solution?
> > 
> 
> Well, we must do something. 
> 
> Alternative suggestions that do solve the problem are certainly
> welcomed.
> 
> Bill
> 

Well, what oracle does (AFAICT) is this:

1.) clients connect to a socket (AF_UNIX preferred, AF_INET okay).

2.) the listener on the socket forks off a child to service
this connection, which is used for an entire transaction (and
possibly more than one transaction).

3.) the child then reads queries from the client, executes them,
and writes the results back.  If the client dies, the child detects
this easily, and rolls back the outstanding transaction, releases
locks, and exits.
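
Step 3 relies on the child noticing that its client has gone away. A
minimal sketch of that detection, assuming a made-up line protocol
(`BEGIN`, `COMMIT`) and a hypothetical `serve_connection` function; no
real transactions or locks are involved, only the end-of-file handling:

```python
import socket


def serve_connection(conn):
    """Serve one client; report how its transaction ended."""
    state = "idle"
    try:
        with conn, conn.makefile("rb") as requests:
            for line in requests:        # the loop ends at end-of-file
                cmd = line.strip()
                if cmd == b"BEGIN":
                    state = "open"
                elif cmd == b"COMMIT" and state == "open":
                    state = "committed"
                conn.sendall(b"OK\n")    # write the result back
    except OSError:
        pass  # reset or broken pipe: the client is gone as well
    if state == "open":
        # The client vanished mid-transaction: undo its work. A real
        # child would also release its locks here before exiting.
        state = "rolled back"
    return state
```

Both a clean end-of-file and a failed write are treated the same way,
since either one means nobody is left to finish the transaction.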

The code for this child process is extremely thoroughly QAed
so that it, itself, is unlikely to fail without releasing the
locks that it holds.  Further, the listener mentioned in (2)
can detect when this (extremely rare) failure happens, and DTRT.

I think this may be the easiest safe strategy (though it does
require a daemon) (which might be auto-started, as mentioned
before).

It is easy to only start one instance of the daemon: only one
process can bind to a particular socket at once.
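
A small sketch of using bind as the "only one daemon" check. It uses
AF_INET on the loopback address so that it runs anywhere; with AF_UNIX a
socket file left behind by a crashed daemon would need extra handling.
`try_become_daemon` is a hypothetical name.

```python
import errno
import socket


def try_become_daemon(port, host="127.0.0.1"):
    """Return a listening socket, or None if another instance owns it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError as err:
        sock.close()
        if err.errno == errno.EADDRINUSE:
            return None      # someone else is already the daemon
        raise
    sock.listen()
    return sock
```

A would-be auto-starter can call this first and, on `None`, simply
connect to the existing daemon instead.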

I don't think it needs to be setuid; it doesn't matter who owns
the daemon (any local user with write access can own the daemon).
One could argue that there are some security problems with this
approach (a local user could create a hostile server to attempt
to exploit a client-side vulnerability).  If this is a concern,
then make the repo owned by svn and mode 600 and start the
daemon ahead of time, or make a setuid daemon-starter.

Of course, this requires that an API be marshalled
into "flattened queries" and responses so that this communication
can take place over the socket connection.
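
To make "flattened" concrete, here is a sketch that uses XML-RPC
encoding, which is the marshalling suggested elsewhere in this thread.
It uses Python's standard `xmlrpc.client` module as a stand-in for
xmlrpc-epi, and the method name `bdb.get` and its arguments are invented
for illustration; they are not part of any real wrapper.

```python
import xmlrpc.client


def marshal_request(method, *args):
    """Flatten a call into bytes suitable for writing to a socket."""
    return xmlrpc.client.dumps(args, methodname=method).encode("utf-8")


def unmarshal_request(payload):
    """Recover (method, args) from the bytes read off the socket."""
    params, method = xmlrpc.client.loads(payload.decode("utf-8"))
    return method, params


def marshal_response(value):
    """Flatten a single return value."""
    return xmlrpc.client.dumps((value,), methodresponse=True).encode("utf-8")


def unmarshal_response(payload):
    """Recover the return value on the client side."""
    (value,), _ = xmlrpc.client.loads(payload.decode("utf-8"))
    return value
```

For example, `marshal_request("bdb.get", "nodes", "key-1")` yields a
byte string that `unmarshal_request` turns back into the method name and
argument tuple on the other end of the connection.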

Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by "Glenn A. Thompson" <gt...@cdr.net>.

Hey,

Pardon my ignorance here but the recovery procedure on sleepycat says this:

"2. It is necessary to recover information after system or application
failure. In this case, recovery processing must be performed on any database
environments that were active at the time of the failure. Recovery processing
involves running the db_recover utility or calling the DB_ENV->open method
with the DB_RECOVER or DB_RECOVER_FATAL flags.

During recovery processing, all database changes made by aborted or unfinished
transactions are undone, and all database changes made by committed
transactions are redone, as necessary. Database applications must not be
restarted until recovery completes. After recovery finishes, the environment
is properly initialized so that applications may be restarted."

I took this to be DB_ENV->open(DB_RECOVER|etc )  == db_recover

svnadmin has a call to recover which is not listed in the help and is
currently compiled out.  In this call it acquires an exclusive lock on db.lock.

It has this comment above it.
"      /* ### TODO: Get this working with new libsvn_repos API.  We need
     the repos API to access the lockfile paths and such, but we
     apparently don't want the locking that comes along with the repos
     API. */
    case svnadmin_cmd_recover:"

Is this code being permanently abandoned?
Should the FS API no longer include a recover function?

"
/* Perform any necessary non-catastrophic recovery on a Berkeley
   DB-based Subversion filesystem, stored in the environment PATH.  Do
...
  any necessary allocation within POOL.
  */

svn_error_t *svn_fs_berkeley_recover (const char *path,
                                      apr_pool_t *pool);"


Just trying to get a handle on this.

Thanks,
gat

Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Philip Martin <ph...@codematters.co.uk>.

"Glenn A. Thompson" <gt...@cdr.net> writes:

> > One thing we need to do is ensure that the BDB recovery process is
> > robust.  The documentation requires that no other process is using the
> > database when you run recover.  At the moment we don't have a way to
> > ensure that.  What we need is a filesystem lock in the db directory,
> > such that when it is present svn_repos_open fails. Then the recovery
> > process is
> 
> I thought that db.lock was used for this purpose.

No idea, I'm not a BDB expert.

> I thought svnadmin acquires an exclusive lock on it for recovery and repos
> grabs a shared lock on open.

svnadmin doesn't do recovery, we run db_recover for that.  I was
proposing to add an svnadmin command that does a bit more checking.

This is the behaviour I observe at present.

$ svn import URL path
^C

The database is now "stuck".  If I was running another svn command at
the same time, then it hangs.  If I start a new command, that also
hangs.

$ svn ls URL
This hangs.

Now I run db_recover.  It doesn't block.  So the recovery runs despite
the hung processes.  This violates the recommended recovery procedure
in the BDB documentation.  While I can check and kill processes before
running recover, I cannot guarantee that another will not start. Hence
the procedure:

$ svnadmin lock repo
Now no new processes can open the database.

$ ps
Kill any existing processes.

$ svnadmin recover repo
This is the only process accessing the database.
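
A minimal sketch of the lock-file idea behind that procedure. None of
this is existing svnadmin behaviour: `lock_repo`, `unlock_repo`,
`open_repo`, and the file name `recover.lock` are hypothetical, and the
real check would live in svn_repos_open.

```python
import os

LOCK_NAME = "recover.lock"   # hypothetical file name inside the repository


def lock_repo(repo_dir):
    """Create the lock file atomically; False means 'already locked'."""
    path = os.path.join(repo_dir, LOCK_NAME)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(str(os.getpid()))   # record who holds the lock
    return True


def unlock_repo(repo_dir):
    """Remove the lock file once recovery has finished."""
    os.remove(os.path.join(repo_dir, LOCK_NAME))


def open_repo(repo_dir):
    """Stand-in for a repository open that refuses while the lock exists."""
    if os.path.exists(os.path.join(repo_dir, LOCK_NAME)):
        raise RuntimeError("repository is locked for recovery")
    return repo_dir
```

This only stops new openers. A process that passed the check just
before the lock appeared is still attached, which is why the procedure
includes the `ps` and kill step before recovery.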

-- 
Philip Martin


Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by "Glenn A. Thompson" <gt...@cdr.net>.

Hey:

>
> One thing we need to do is ensure that the BDB recovery process is
> robust.  The documentation requires that no other process is using the
> database when you run recover.  At the moment we don't have a way to
> ensure that.  What we need is a filesystem lock in the db directory,
> such that when it is present svn_repos_open fails. Then the recovery
> process is
>

I thought that db.lock was used for this purpose.
I thought svnadmin acquires an exclusive lock on it for recovery and repos
grabs a shared lock on open.
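
For reference, the shared/exclusive scheme described here can be
sketched with advisory `flock` locks. This is an illustration of the
idea on Unix only, not a statement of what Subversion currently does;
`open_shared` and `try_exclusive` are hypothetical names.

```python
import fcntl
import os


def open_shared(lock_path):
    """Normal opener: take a shared lock and keep the descriptor open."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    fcntl.flock(fd, fcntl.LOCK_SH)
    return fd


def try_exclusive(lock_path):
    """Recovery: succeed only if nobody holds a shared lock."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None          # the store is still in use
    return fd
```

Closing the descriptor releases the lock, and the kernel does the same
when a process dies, so a crashed client does not leave a stale lock
behind. It does not by itself say whether the crashed client left the
database needing recovery.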

gat

Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Philip Martin <ph...@codematters.co.uk>.

mark benedetto king <bk...@Inquira.Com> writes:

> > Lock the repository so that new processes fail to open it
> > $ svnadmin lock /path/to/repos
> > 
> > Now check for existing processes that are using the DB
> > $ ps
> > $ lsof
> > $ kill xxxx
> > $ kill -9 xxxx
> > 
> > Now run BDB recovery and clear the lock
> > $ svnadmin recover /path/to/repository
> 
> What happens when svnadmin crashes after obtaining a lock?

It probably means you need to do catastrophic BDB recovery.

> You've got a stale lock file.
> 
> If you handle stale lock files by rm'ing them, we're back
> into a lock-stealing scenario (how do you really know the
> lock is stale?)
> 
> A user can know that no one else is mucking around in his WC.
> 
> An administrator frequently isn't quite so sure that no one
> else is working on *exactly the same problem*.

I was assuming that it would be like

$ svnadmin lock repo
$
OK, I can work on fixing this.

$ svnadmin lock repo
svnadmin: error: already locked
$
Oh! Someone else is doing something.
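
The behaviour sketched above (the second "svnadmin lock" fails at once)
can be had from an atomic create-if-absent. This is only an illustration:
"svnadmin lock" is the command proposed in this thread, and the file name
admin.lock and the function names below are assumptions.

```python
import os


class AlreadyLocked(Exception):
    """Raised when the repository lock file already exists."""


def lock_repository(repos_dir, lock_name="admin.lock"):
    """Create the lock file atomically; fail if someone else holds it."""
    path = os.path.join(repos_dir, lock_name)
    try:
        # O_CREAT | O_EXCL makes "check and create" a single atomic step.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        raise AlreadyLocked("already locked: " + path)
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()) + "\n")  # record the owner for the admin
    return path


def unlock_repository(path):
    """Remove the lock file once recovery is finished."""
    os.remove(path)
```

A plain lock file like this still goes stale if its creator crashes, which
is exactly the objection quoted above.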

> It's a secure recovery process, but it's a manual recovery process.
> Personally, I don't want to have to run the command sequence above
> every time someone hits ^C on their client.  I'd much rather the
> recovery process only be needed in the case of power-outage.

Yes, but we are going to fix the client to handle ^C.  I don't want
BDB recovery to run at all, whether manually or automatically, if
someone hits ^C, as that involves other clients failing.

> I think that requiring manual locking, ps-ing, kill-ing, recovering, etc
> does not meet this definition of robustness.

Well, I disagree :)

I think the plans I have seen so far, to automatically kill clients,
are far less desirable.  I haven't seen one that looks to be
particularly robust.

BDB recovery won't need to be done that often.  If it is we've got
something wrong.

Finally bear in mind that depending on what has happened, BDB recovery
may fail.  In which case you need to do catastrophic recovery, which
will almost certainly require manual intervention.

-- 
Philip Martin


RE: Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Bill Tutt <ra...@lyra.org>.

> From: mark benedetto king [mailto:bking@inquira.com]
> On Fri, Sep 20, 2002 at 11:29:12AM +0100, Philip Martin wrote:
> >
> >
> > Now suppose you also want to run BDB recovery automatically.  I
> > probably would not do that myself, but no matter.  Can we use Apache
> > to do this?  I'm not an Apache expert, but it does have a controlling
> > process that remains in communication with its children.  Could we
> > provide an Apache module, or a mod_dav_svn directive, that causes
> > Apache to detect children that disappear by dumping core, or children
> > that hang and become unresponsive?  Apache could then lock the
> > repository to block new children, terminate any existing mod_dav_svn
> > children and finally run repository recovery.
> >
> > Then, to have a system that automatically recovers a locked database
> > you run Apache and only allow access through ra_dav.
> >
> > Is this possible?  Does it satisfy your ACID requirements?
> >
> 
> IANAAE, E. :-)
> 
> If we only allow access through ra_dav and all of those things can
> be accomplished reliably with apache, then yes, I think it satisfies
> them.
> 

Currently, though, we're not set up to allow access only through ra_dav.
If we really did that then, yes, we'd satisfy the ACID requirements.

Examples in Subversion that currently violate that principle:
* svnadmin
* svnlook
etc...

Bill




Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by mark benedetto king <bk...@inquira.com>.

On Fri, Sep 20, 2002 at 11:29:12AM +0100, Philip Martin wrote:
> 
> One thing we need to do is ensure that the BDB recovery process is
> robust.  The documentation requires that no other process is using the
> database when you run recover.  At the moment we don't have a way to
> ensure that.  What we need is a filesystem lock in the db directory,
> such that when it is present svn_repos_open fails. Then the recovery
> process is
> 
> Lock the repository so that new processes fail to open it
> $ svnadmin lock /path/to/repos
> 
> Now check for existing processes that are using the DB
> $ ps
> $ lsof
> $ kill xxxx
> $ kill -9 xxxx
> 
> Now run BDB recovery and clear the lock
> $ svnadmin recover /path/to/repository

What happens when svnadmin crashes after obtaining a lock?

You've got a stale lock file.

If you handle stale lock files by rm'ing them, we're back
into a lock-stealing scenario (how do you really know the
lock is stale?)

A user can know that no one else is mucking around in his WC.

An administrator frequently isn't quite so sure that no one
else is working on *exactly the same problem*.

We'd need a reference-counted lock file, which means we'd need
svnadmin not to exit.

So, you'd have something like:

$ svnadmin lock /path/to/repos
bash(svnadmin)> ps
bash(svnadmin)> lsof
bash(svnadmin)> kill xxxx
bash(svnadmin)> kill -9 xxxx
bash(svnadmin)> svnadmin recover /path/to/repository
bash(svnadmin)> exit

Then, if we're really careful about POSIX flock() semantics,
we could guarantee that 

    1.) no two svnadmins are running at the same time
    2.) no new connections are created after the svnadmin runs

This would probably be a lot less effort than a "NetBDB" implementation,
and obviously would not adversely affect performance, etc.   Also, this
functionality would be required for recovery after system crashes.
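
To illustrate why a lock held by a live process avoids the stale-lock
problem, here is a small sketch (hypothetical names, not svnadmin code).
It relies on the fact that an flock() lock is dropped by the kernel when
the holder's descriptors are closed, including when the holder is killed.
Note that flock() is strictly a BSD call; POSIX specifies fcntl() record
locks, whose semantics differ.

```python
import fcntl
import os
import subprocess
import sys

# Child program: take the exclusive lock, announce it, then just sit there.
HOLDER = (
    "import fcntl, os, sys, time\n"
    "fd = os.open(sys.argv[1], os.O_RDWR | os.O_CREAT, 0o644)\n"
    "fcntl.flock(fd, fcntl.LOCK_EX)\n"
    "print('locked', flush=True)\n"
    "time.sleep(60)\n"
)


def try_exclusive(lock_path):
    """Return an fd holding the exclusive lock, or None if it is taken."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def crash_releases_lock(lock_path):
    """Start a lock holder, SIGKILL it, and report (held_before, free_after)."""
    holder = subprocess.Popen([sys.executable, "-c", HOLDER, lock_path],
                              stdout=subprocess.PIPE, text=True)
    holder.stdout.readline()           # wait until the child holds the lock
    held_before = try_exclusive(lock_path) is None
    holder.kill()                      # simulate a crash: no cleanup runs
    holder.wait()
    holder.stdout.close()
    fd = try_exclusive(lock_path)
    free_after = fd is not None
    if fd is not None:
        os.close(fd)
    return held_before, free_after
```

The lock file itself may remain on disk after the crash, but nothing holds
it, so nobody has to decide whether it is safe to rm.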

> 
> That provides a secure recovery process, in the face of Subversion
> clients and servers.  Obviously a user could write a program that

It's a secure recovery process, but it's a manual recovery process.
Personally, I don't want to have to run the command sequence above
every time someone hits ^C on their client.  I'd much rather the
recovery process only be needed in the case of power-outage.

> bypasses svn_repos_open if they have sufficient OS/filesystem access,
> but then they can also use raw BDB calls, normal stdio, or a normal
> editor!  If you are concerned about such cases, they are handled by
> the usual OS security measures.

Anyone who does those things deserves what they get.  Actually, they
deserve worse. :-)

> At this stage I would argue that we have a "completely and utterly
> robust" system.  Once a transaction has been committed it is secure,
> it will never be lost.

Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Philip Martin <ph...@codematters.co.uk>.

"Bill Tutt" <ra...@lyra.org> writes:

> Well, we must do something. 
> 
> Alternative suggestions that do solve the problem are certainly
> welcomed.

One thing we need to do is ensure that the BDB recovery process is
robust.  The documentation requires that no other process is using the
database when you run recover.  At the moment we don't have a way to
ensure that.  What we need is a filesystem lock in the db directory,
such that when it is present svn_repos_open fails. Then the recovery
process is

Lock the repository so that new processes fail to open it
$ svnadmin lock /path/to/repos

Now check for existing processes that are using the DB
$ ps
$ lsof
$ kill xxxx
$ kill -9 xxxx

Now run BDB recovery and clear the lock
$ svnadmin recover /path/to/repository

That provides a secure recovery process, in the face of Subversion
clients and servers.  Obviously a user could write a program that
bypasses svn_repos_open if they have sufficient OS/filesystem access,
but then they can also use raw BDB calls, normal stdio, or a normal
editor!  If you are concerned about such cases, they are handled by
the usual OS security measures.

At this stage I would argue that we have a "completely and utterly
robust" system.  Once a transaction has been committed it is secure,
it will never be lost.

Now suppose you also want to run BDB recovery automatically.  I
probably would not do that myself, but no matter.  Can we use Apache
to do this?  I'm not an Apache expert, but it does have a controlling
process that remains in communication with its children.  Could we
provide an Apache module, or a mod_dav_svn directive, that causes
Apache to detect children that disappear by dumping core, or children
that hang and become unresponsive?  Apache could then lock the
repository to block new children, terminate any existing mod_dav_svn
children and finally run repository recovery.
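
As a rough sketch of the "controlling process" idea, and not of anything
Apache actually provides, a parent can tell from the wait status whether
a child exited normally or was killed by a signal (which is what a core
dump looks like to the parent). The function names are hypothetical, and
detecting children that hang is not covered here.

```python
import subprocess


def classify_exit(returncode):
    """Map a subprocess return code to 'clean', 'error' or 'crashed'."""
    if returncode == 0:
        return "clean"
    if returncode < 0:
        # subprocess reports death by signal N as return code -N.
        return "crashed"
    return "error"


def supervise(commands):
    """Run each command as a child and wait for all of them.

    Returns (results, needs_recovery), where needs_recovery is True if any
    child was killed by a signal.
    """
    children = [subprocess.Popen(cmd) for cmd in commands]
    results = [classify_exit(child.wait()) for child in children]
    return results, "crashed" in results
```

A real controlling process would block new children and stop the surviving
ones before running recovery; this only shows the detection step.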

Then, to have a system that automatically recovers a locked database
you run Apache and only allow access through ra_dav.

Is this possible?  Does it satisfy your ACID requirements?

-- 
Philip Martin


Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Philip Martin <ph...@codematters.co.uk>.

Philip Martin <ph...@codematters.co.uk> writes:

> > Alternative suggestions that do solve the problem are certainly
> > welcomed.
> 
> It depends on how you define "completely and utterly robust" :-)
> 
> Having all the clients receive SVN_ERR_REPOS_RECOVER and print
> "repository /foo/bar/repo requires recovery" and relying on the
> repository administrator running db_recover would do for me.

Having experimented a bit, it appears the clients just hang when there
is a repository problem.  At least that's what happens at present.  So
it may be difficult to generate an error.  That doesn't really affect
my view of the solution, I still think manual intervention is the best
solution.  It may be that a hanging client is what alerts the user,
rather than an explicit error message.

> At present, ra_local never loses data once it has been committed, it
> is "completely and utterly robust".  All I want is to minimise the
> circumstances that cause SVN_ERR_REPOS_RECOVER.  But when it does
> occur I am happy with manual intervention.

-- 
Philip Martin


Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Philip Martin <ph...@codematters.co.uk>.

"Bill Tutt" <ra...@lyra.org> writes:

> > You are seriously proposing a root process that kills other processes?
> > Including my Subversion aware editor which happens to be accessing the
> > repository?  And my debugging session which happens to have reused the
> > crashed Subversion client process ID?  This is a "completely and
> > utterly robust" solution?
> 
> Well, we must do something. 
> 
> Alternative suggestions that do solve the problem are certainly
> welcomed.

It depends on how you define "completely and utterly robust" :-)

Having all the clients receive SVN_ERR_REPOS_RECOVER and print
"repository /foo/bar/repo requires recovery" and relying on the
repository administrator running db_recover would do for me.

At present, ra_local never loses data once it has been committed, it
is "completely and utterly robust".  All I want is to minimise the
circumstances that cause SVN_ERR_REPOS_RECOVER.  But when it does
occur I am happy with manual intervention.

-- 
Philip Martin


RE: Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Bill Tutt <ra...@lyra.org>.

> From: Philip Martin [mailto:philip@codematters.co.uk]
> 
> "Bill Tutt" <ra...@lyra.org> writes:
> 
> > > how does the watcher process kill the other processes?  are we going
> > > to install it with the appropriate privs so that it can kill them?
> > > setuid binaries give me the creeps...
> >
> > Yes, it needs appropriate privs so that it can kill them if necessary.
> > We can come up with various schemes to tell the client process to exit
> > ASAP, but I think that the watcher process still needs to be able to
> > forcefully kill the client applications.
> 
> You are seriously proposing a root process that kills other processes?
> Including my Subversion aware editor which happens to be accessing the
> repository?  And my debugging session which happens to have reused the
> crashed Subversion client process ID?  This is a "completely and
> utterly robust" solution?
> 

Well, we must do something. 

Alternative suggestions that do solve the problem are certainly
welcomed.

Bill



Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Philip Martin <ph...@codematters.co.uk>.

"Bill Tutt" <ra...@lyra.org> writes:

> > how does the watcher process kill the other processes?  are we going to
> > install it with the appropriate privs so that it can kill them? setuid
> > binaries give me the creeps...
> 
> Yes, it needs appropriate privs so that it can kill them if necessary.
> We can come up with various schemes to tell the client process to exit
> ASAP, but I think that the watcher process still needs to be able to
> forcefully kill the client applications.

You are seriously proposing a root process that kills other processes?
Including my Subversion aware editor which happens to be accessing the
repository?  And my debugging session which happens to have reused the
crashed Subversion client process ID?  This is a "completely and
utterly robust" solution?

-- 
Philip Martin


RE: Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Bill Tutt <ra...@lyra.org>.

> From: Garrett Rooney [mailto:rooneg@electricjellyfish.net]
> 
> Bill Tutt wrote:
> 
> > If the watcher process detects an exiting registered process that
> > hasn't deregistered then the datastore is now suspect. The watcher
> > process must now cause all in process transactions to be aborted.
> >
> > This should probably be accomplished by using some asynchronous
> > notification + timeout. If the timeout expires before the other
> > remaining processes exit out, then the watcher process may kill the
> > process explicitly.
> >
> > Once all of the registered processes have either exited with a useful
> > failure message, or been forcefully killed, then the watcher is allowed
> > to recover the datastore.
> 
> how does the watcher process kill the other processes?  are we going to
> install it with the appropriate privs so that it can kill them?  setuid
> binaries give me the creeps...
> 

Yes, it needs appropriate privs so that it can kill them if necessary.
We can come up with various schemes to tell the client process to exit
ASAP, but I think that the watcher process still needs to be able to
forcefully kill the client applications.
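
The "ask nicely, then kill" step could look roughly like the sketch below.
It is only an illustration with a hypothetical function name: it works on
child processes started through subprocess, whereas a real watcher would
be signalling unrelated processes and would need permission to do so.

```python
import subprocess


def stop_client(proc, grace_seconds):
    """Ask proc to exit; kill it if it is still alive after the grace period.

    Sends SIGTERM first, then SIGKILL on timeout.
    Returns 'exited' or 'killed'.
    """
    proc.terminate()                      # polite request (SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        return "exited"
    except subprocess.TimeoutExpired:
        proc.kill()                       # forceful (SIGKILL)
        proc.wait()
        return "killed"
```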

> also, would we have to have the watcher start up at system start?  that
> adds quite a bit of overhead to the process of creating a repository.
> perhaps it would be possible to have the first svn process that tries to
> access the repository start the watcher...
> 

This is the easiest way to kick start the watcher up. 

Well, if the svn process itself started the watcher, then yeah, the
binary would need to be setuid. Ick. I'd rather it not be setuid, and
just run in the correct user security context. (Not to mention fun
portability issues if we use setuid.)

Bill




Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

Bill Tutt wrote:

> If the watcher process detects an exiting registered process that hasn't
> deregistered then the datastore is now suspect. The watcher process
> must now cause all in process transactions to be aborted.
> 
> This should probably be accomplished by using some asynchronous
> notification + timeout. If the timeout expires before the other
> remaining processes exit out, then the watcher process may kill the
> process explicitly.
> 
> Once all of the registered processes have either exited with a useful
> failure message, or been forcefully killed, then the watcher is allowed
> to recover the datastore.

how does the watcher process kill the other processes?  are we going to 
install it with the appropriate privs so that it can kill them?  setuid 
binaries give me the creeps...

also, would we have to have the watcher start up at system start?  that 
adds quite a bit of overhead to the process of creating a repository. 
perhaps it would be possible to have the first svn process that tries to 
access the repository start the watcher...

-garrett


RE: Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by Bill Tutt <ra...@lyra.org>.

Well, Windows can just wait for the process to exit using some kind of
WaitForMultipleObjects() scheme. We'd probably need some other scheme for
Unix boxes given how useless process APIs are on Unix, i.e. there isn't
an openpid() API that lets you then call waitpid().

Suggestions for an alternate scheme would be appreciated.
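
One possible alternate scheme for Unix, offered as a sketch rather than a
worked-out design: each client keeps a socket open to the watcher for as
long as it has the store open, and sends a goodbye message before closing.
The kernel closes the socket when the client dies, so end-of-file without
the goodbye means an unclean exit, with no need to track PIDs. The b"BYE"
message and the function name are assumptions.

```python
import socket


def watch_client(conn):
    """Block until the client end closes and report how it went away.

    Returns 'deregistered' if the client sent b'BYE' before closing,
    'died' if the connection reached end-of-file without it.
    """
    data = b""
    while True:
        chunk = conn.recv(64)
        if not chunk:          # EOF: every copy of the peer's fd is closed
            break
        data += chunk
    conn.close()
    return "deregistered" if data.endswith(b"BYE") else "died"
```

One caveat: if a client forks and a child inherits the descriptor,
end-of-file is only seen once every copy has been closed.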

Bill
----
Do you want a dangerous fugitive staying in your flat?
No.
Well, don't upset him and he'll be a nice fugitive staying in your flat.
 

> -----Original Message-----
> From: Glenn A. Thompson [mailto:gthompson@cdr.net]
> Sent: Thursday, September 19, 2002 9:39 AM
> To: Subversion Dev list
> Subject: Re: #739: Ensuring ACID in Subversion (aka watcher procecesses
> are fun)
> 
> >
> > If the watcher process detects an exiting registered process that hasn't
> >
> 
> How is he detecting this?
> Are you thinking about some sort of lease scheme?
> gat
> 
> 




Re: #739: Ensuring ACID in Subversion (aka watcher procecesses are fun)

Posted by "Glenn A. Thompson" <gt...@cdr.net>.

>
> If the watcher process detects an exiting registered process that hasn't
>

How is he detecting this?
Are you thinking about some sort of lease scheme?
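
For what a lease scheme might look like, here is a minimal sketch; nobody
in the thread has specified one, so the class and method names are purely
illustrative. Each client renews its lease periodically while it has the
store open, and the watcher treats any lease that lapses without being
released as a process that died.

```python
import time


class LeaseTable:
    """Track per-client leases that must be renewed before they lapse."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock        # injectable so the logic can be tested
        self.expiry = {}          # client id -> time at which its lease lapses

    def renew(self, client_id):
        """Register a client, or extend the lease of an existing one."""
        self.expiry[client_id] = self.clock() + self.ttl

    def release(self, client_id):
        """Clean deregistration: the client closed the store normally."""
        self.expiry.pop(client_id, None)

    def expired(self):
        """Return the sorted ids of clients whose lease has lapsed."""
        now = self.clock()
        return sorted(cid for cid, t in self.expiry.items() if t <= now)
```

The trade-off is latency: a dead client is only noticed after its lease
runs out, and a slow but healthy client can be misjudged if the lease is
too short.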
gat

