Posted to dev@subversion.apache.org by "Michael A. Olson" <ma...@sleepycat.com> on 2001/02/22 00:29:21 UTC

Berkeley DB in Subversion

Hi,

I'm Mike Olson.  I work at Sleepycat Software.  We develop,
distribute, and support Berkeley DB.

I came across this excerpt on your Web site, and wanted to
follow up with you about it:

     SQL Back-End

     The Subversion filesystem will probably use Berkeley DB to store 
     data on disk; however, a real SQL database provides much more 
     reliable transactions.  Someone can rewrite the filesystem back-end 
     to speak SQL. 

Needless to say, we disagree pretty strongly about the reliable
transactions statement.  Berkeley DB survives failure without loss
of data, and without corruption.  We're a bunch of database
heavyweights with significant time at the big relational companies.
We use the same techniques that the other vendors do for transactions,
including two-phase locking and write-ahead logging.  I don't think
that a relational client/server system would be more reliable, but
I'm certain that it would be slower.

Have you had problems with Berkeley DB that led you to make that
statement?  If there's a problem, we'd like to know more so that
we can help you fix it.

If you're in deployment now with Berkeley DB in Milestone 1, we'd
like to include you on our open source partners page.  We've
recently redesigned the Web site, and will put up a new page in
the next few weeks that lists the open source projects that rely
on Berkeley DB.  We'd be glad to have Subversion on that list,
if you're willing.

Please do let me know about the reliability issue.

					mike
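
For readers unfamiliar with the write-ahead logging mentioned above,
here is a minimal sketch of the idea in Python.  It is a toy written
for illustration only, not Berkeley DB's implementation or API: the
TinyWAL class, the JSON record layout, and the demo() helper are all
assumptions made up for this example, and it does not attempt
two-phase locking at all.  The one rule it tries to show is that a
change is appended to a log and forced to disk before it is applied,
so that recovery can replay committed work and ignore the rest.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._recover()

    def _recover(self):
        # Replay the log; only transactions with a commit record are applied.
        if not os.path.exists(self.log_path):
            return
        pending = {}
        with open(self.log_path) as log:
            for line in log:
                try:
                    rec = json.loads(line)
                except ValueError:
                    break  # torn final record left by a crash: stop replaying
                if rec["op"] == "put":
                    pending.setdefault(rec["txn"], []).append(
                        (rec["key"], rec["value"]))
                elif rec["op"] == "commit":
                    for key, value in pending.pop(rec["txn"], []):
                        self.data[key] = value

    def commit(self, txn_id, changes):
        # Write-ahead rule: the log reaches disk before the data changes.
        with open(self.log_path, "a") as log:
            for key, value in changes.items():
                log.write(json.dumps(
                    {"op": "put", "txn": txn_id,
                     "key": key, "value": value}) + "\n")
            log.write(json.dumps({"op": "commit", "txn": txn_id}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.data.update(changes)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "wal.log")
    store = TinyWAL(path)
    store.commit(1, {"rev": "1"})
    # Simulate a crash partway through transaction 2: no commit record.
    with open(path, "a") as log:
        log.write(json.dumps(
            {"op": "put", "txn": 2, "key": "rev", "value": "2"}) + "\n")
    # Reopening the store runs recovery against the log.
    return TinyWAL(path).data
```

Calling demo() reopens the store after the simulated crash; the
uncommitted write from transaction 2 is discarded during replay, and
only the committed value from transaction 1 survives.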

Re: Berkeley DB in Subversion

Posted by "Michael A. Olson" <ma...@sleepycat.com>.
At 07:26 PM 2/21/01 -0800, Greg Stein wrote:

> > I don't think
> > that a relational client/server system would be more reliable,
> 
> Agreed.
> 
> > but I'm certain that it would be slower.
> 
> Not sure that I agree with the generalization, but it *is* generally true :-)

The company line is that BDB is faster than a relational server
because you have to go out of process to fetch a record from a
standalone server, but with Berkeley DB, the IPC turns into a
function call.  Of course, I'm speculating about the performance
of a relational server implementation that doesn't exist, so we
don't need this nail hammered all the way into the wood :-).

> As a sign of good faith :-), I've updated the documentation (in CVS). It
> doesn't appear to automatically propagate to the web site, but Karl can see
> that it happens.

Thanks.  I appreciate it.

> Berkeley DB isn't used in M1, but will appear in M2. We don't have a current
> date for that, but I'd say sometime in March. As I mentioned before, it is
> more than just M1... we plan to ship it that way. We currently require at
> least 3.1.14, but I imagine that we'd want to upgrade that to your 3.2
> releases before our final release.

If you have questions or problems during the implementation, you should
let us know.  Email to support@sleepycat.com will get our attention.
Be sure to mention that you're working on Subversion.  I do encourage
you to make 3.2.9 the basis for the release, since the newest code has the
most known bugs fixed.  3.1.14 and 3.1.17 are in wide deployment now,
and work fine, but if you get to choose, go for 3.2.

Jim Blandy wrote:

> Subversion is not yet deployed, but when it is, I think it would be
> great to have it on Sleepycat's site, assuming the other developers
> don't object.

I'll make a note to ping you again in a couple of months.  Thanks a
lot for the quick response on the Web site.

					mike

Re: Berkeley DB in Subversion

Posted by Greg Stein <gs...@lyra.org>.
On Wed, Feb 21, 2001 at 04:29:21PM -0800, Michael A. Olson wrote:
> Hi,
> 
> I'm Mike Olson.  I work at Sleepycat Software.  We develop,
> distribute, and support Berkeley DB.

Hi Mike!

> I came across this excerpt on your Web site, and wanted to
> follow up with you about it:
> 
>      SQL Back-End
> 
>      The Subversion filesystem will probably use Berkeley DB to store 
>      data on disk; however, a real SQL database provides much more 
>      reliable transactions.  Someone can rewrite the filesystem back-end 
>      to speak SQL.
> 
> Needless to say, we disagree pretty strongly about the reliable
> transactions statement.

I wouldn't take it too strongly. I'm not sure who wrote it or when,
but we're quite happy with using Berkeley DB and plan to ship Subversion
with only BDB support. A future release will enable a pluggable database
backend, but it certainly isn't our top priority.

I'd agree that the statement is improper. The only real advantage that I
know SQL has over BDB is its relational query support.

>...
> I don't think
> that a relational client/server system would be more reliable,

Agreed.

> but I'm certain that it would be slower.

Not sure that I agree with the generalization, but it *is* generally true :-)

> Have you had problems with Berkeley DB that led you to make that
> statement?  If there's a problem, we'd like to know more so that
> we can help you fix it.

None at all. As I said, I'm not sure when/where the statement came from, but
it is entirely possible that somebody unfamiliar with available database
technology wrote it.

As a sign of good faith :-), I've updated the documentation (in CVS). It
doesn't appear to automatically propagate to the web site, but Karl can see
that it happens.

> If you're in deployment now with Berkeley DB in Milestone 1, we'd
> like to include you on our open source partners page.  We've
> recently redesigned the Web site, and will put up a new page in
> the next few weeks that lists the open source projects that rely
> on Berkeley DB.  We'd be glad to have Subversion on that list,
> if you're willing.

Berkeley DB isn't used in M1, but will appear in M2. We don't have a current
date for that, but I'd say sometime in March. As I mentioned before, it is
more than just M1... we plan to ship it that way. We currently require at
least 3.1.14, but I imagine that we'd want to upgrade that to your 3.2
releases before our final release.

Karl Fogel and Brian Behlendorf are the "official" guys who can state our
willingness to be on your partners page. Personally speaking, I'd love for
Subversion to be there!

> Please do let me know about the reliability issue.

We have none. An oversight, which has been corrected.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Berkeley DB in Subversion

Posted by Karl Fogel <kf...@galois.collab.net>.
Heh heh.  Regarding the weird text (now corrected by Greg Stein) that
claimed SQL databases had more reliable transactions than Berkeley DB,
I recently wrote this to Brian Behlendorf in a private mail:

> He's quite right to complain.  I don't know what crack someone was
> smoking to claim that Berkeley DB has unreliable transactions.

Then, perhaps unconsciously suspecting the worst, I ran 

   $ cvs annotate -r 1.10 future.texi

on the file to see who *had* written it.  Of course, you already know
where this is going: it was me!

My apologies for the baseless libel, Michael :-).  I honestly have no
memory of what I was thinking.

-K

Jim Blandy <ji...@zwingli.cygnus.com> writes:
> Michael, I did the Subversion filesystem design, and chose Berkeley DB
> for this implementation, precisely for its recoverability and
> transaction support.  I have no idea who put that bit on the web site.
> If someone knows of problems with Berkeley DB's recoverability, I'm
> also very interested in hearing about them.
> 
> As written, the web text seems to make a superstitious association
> between reliability and the query language a database uses.  It's kind
> of embarrassing.  I'd like to see it either substantiated, or revised.
> 
> Subversion is not yet deployed, but when it is, I think it would be
> great to have it on Sleepycat's site, assuming the other developers
> don't object.
> 
> 
> > I'm Mike Olson.  I work at Sleepycat Software.  We develop,
> > distribute, and support Berkeley DB.
> > 
> > I came across this excerpt on your Web site, and wanted to
> > follow up with you about it:
> > 
> >      SQL Back-End
> > 
> >      The Subversion filesystem will probably use Berkeley DB to store 
> >      data on disk; however, a real SQL database provides much more 
> >      reliable transactions.  Someone can rewrite the filesystem back-end 
> >      to speak SQL. 
> > 
> > Needless to say, we disagree pretty strongly about the reliable
> > transactions statement.  Berkeley DB survives failure without loss
> > of data, and without corruption.  We're a bunch of database
> > heavyweights with significant time at the big relational companies.
> > We use the same techniques that the other vendors do for transactions,
> > including two-phase locking and write-ahead logging.  I don't think
> > that a relational client/server system would be more reliable, but
> > I'm certain that it would be slower.
> > 
> > Have you had problems with Berkeley DB that led you to make that
> > statement?  If there's a problem, we'd like to know more so that
> > we can help you fix it.
> > 
> > If you're in deployment now with Berkeley DB in Milestone 1, we'd
> > like to include you on our open source partners page.  We've
> > recently redesigned the Web site, and will put up a new page in
> > the next few weeks that lists the open source projects that rely
> > on Berkeley DB.  We'd be glad to have Subversion on that list,
> > if you're willing.
> > 
> > Please do let me know about the reliability issue.
> > 
> > 					mike

Re: Berkeley DB in Subversion

Posted by Jim Blandy <ji...@zwingli.cygnus.com>.
Michael, I did the Subversion filesystem design, and chose Berkeley DB
for this implementation, precisely for its recoverability and
transaction support.  I have no idea who put that bit on the web site.
If someone knows of problems with Berkeley DB's recoverability, I'm
also very interested in hearing about them.

As written, the web text seems to make a superstitious association
between reliability and the query language a database uses.  It's kind
of embarrassing.  I'd like to see it either substantiated, or revised.

Subversion is not yet deployed, but when it is, I think it would be
great to have it on Sleepycat's site, assuming the other developers
don't object.


> I'm Mike Olson.  I work at Sleepycat Software.  We develop,
> distribute, and support Berkeley DB.
> 
> I came across this excerpt on your Web site, and wanted to
> follow up with you about it:
> 
>      SQL Back-End
> 
>      The Subversion filesystem will probably use Berkeley DB to store 
>      data on disk; however, a real SQL database provides much more 
>      reliable transactions.  Someone can rewrite the filesystem back-end 
>      to speak SQL. 
> 
> Needless to say, we disagree pretty strongly about the reliable
> transactions statement.  Berkeley DB survives failure without loss
> of data, and without corruption.  We're a bunch of database
> heavyweights with significant time at the big relational companies.
> We use the same techniques that the other vendors do for transactions,
> including two-phase locking and write-ahead logging.  I don't think
> that a relational client/server system would be more reliable, but
> I'm certain that it would be slower.
> 
> Have you had problems with Berkeley DB that led you to make that
> statement?  If there's a problem, we'd like to know more so that
> we can help you fix it.
> 
> If you're in deployment now with Berkeley DB in Milestone 1, we'd
> like to include you on our open source partners page.  We've
> recently redesigned the Web site, and will put up a new page in
> the next few weeks that lists the open source projects that rely
> on Berkeley DB.  We'd be glad to have Subversion on that list,
> if you're willing.
> 
> Please do let me know about the reliability issue.
> 
> 					mike