Posted to dev@subversion.apache.org by Troy Curtis Jr <tr...@gmail.com> on 2007/10/14 02:16:21 UTC

Scalability of the new merge-tracking

Has anyone been explicitly testing the scalability of the new
merge-tracking features?  I'm sure that utilizing sqlite on the
server-side goes a long way to keeping performance up, but it does
seem that my team's code tends to push svn performance (well some
people's are certainly larger than mine but, eh).  Just curious if
there have been any explicit tests.

-- 
"Beware of spyware. If you can, use the Firefox browser." - USA Today
Download now at http://getfirefox.com
Registered Linux User #354814 ( http://counter.li.org/)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Scalability of the new merge-tracking

Posted by Troy Curtis Jr <tr...@gmail.com>.
On 10/15/07, David Glasser <gl...@davidglasser.net> wrote:
> On 10/15/07, Troy Curtis Jr <tr...@gmail.com> wrote:
> > On 10/14/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > > Can you define "scalability"?  That's a pretty vague word.  :-)
> > >
> > > On 10/13/07, Troy Curtis Jr <tr...@gmail.com> wrote:
> > > > Has anyone been explicitly testing the scalability of the new
> > > > merge-tracking features?  I'm sure that utilizing sqlite on the
> > > > server-side goes a long way to keeping performance up, but it does
> > > > seem that my team's code tends to push svn performance (well some
> > > > people's are certainly larger than mine but, eh).  Just curious if
> > > > there have been any explicit tests.
> > >
> >
> > Too true, too true.  Really my question was just out there to see if
> > anyone had done any testing toward defining/quantifying the merge
> > tracking scalability.  For instance, I had to move away from FSFS as
> > my backend because it didn't scale well to my deep and numerous
> > directory structure (at first I thought it was number of revs >65k but
> > others have more than that, then thought maybe it was size >2GB, but
> > others have MUCH larger than that).  Of course it would probably be
> > pretty difficult to concoct a very large repo with lots of complicated
> > merges in order to test this concept.
> >
> > I guess it probably isn't that useful of a question to ask now that I
> > sit back and think about it.
>
> Out of curiosity, Troy, what did you move to?  Back to BDB?
>
> I'm actively working on improving the scalability of FSFS (I committed
> a few patches this week to cache revision data in RAM more
> aggressively), though I haven't looked at merge-tracking specifically
> yet.  (Note also that most of these caches are most effective over
> svnserve or over DAV with Apache tuned to make sure that all requests
> for a given user command go to the same child.)
>
> --dave
>
> --
> David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
>

Yes I did go with BDB (and it has caused me a few headaches to be
sure), but it seems to do very well.  But when I was evaluating which
back-end to use I noticed the big differences were in checkout and
export.  Someone pointed out that those activities happen relatively
infrequently and so they might not be the best reason to choose BDB
over FSFS.  I did agree with them but then I found a deal-breaker.  It
turns out that I needed to do some pretty frequent hot-copies to
support remote disconnected development (using a set of support
scripts) off of removable hard-drives.  It turns out that copying > 60k
files of ANYTHING takes a ridiculously long time (~40 minutes if I
remember right).

I think that one of the main performance hits with FSFS was digging
through all those individual files (YES I know about skip-revisions!
:) ).  About the only way I could see mitigating this (and my
hot-copy issue) is to do the equivalent of "git pack".  Basically
issue a command that will pack together N revisions into a single
"meta-revision" file and seek around inside that to find your
revisions.  Of course it probably wouldn't/shouldn't be just one file,
but a set of meta-files.  Perhaps broken into files that were <2GB, or
a max of M revisions or something like that. Of course that sounds
like no trivial task to me.
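
The packing idea above can be sketched roughly as follows.  This is only
an illustration under stated assumptions, not Subversion's on-disk
format: the index layout, the ".pack" file naming, and the function
names write_pack, read_revision, pack_in_shards, and read_from_shards
are all hypothetical.  Each pack file starts with a small index of
(offset, length) pairs so that a reader can seek straight to one
revision, and revisions are split into shards of at most M per file.

```python
import os
import struct
import tempfile

# Hypothetical layout: a revision count, then one (offset, length)
# index entry per revision, then the revision payloads back to back.
HEADER = struct.Struct("<I")   # number of revisions in this pack
ENTRY = struct.Struct("<QQ")   # absolute offset and length of one revision


def write_pack(pack_path, revisions):
    """Pack a list of revision payloads (bytes) into a single file."""
    offset = HEADER.size + ENTRY.size * len(revisions)
    with open(pack_path, "wb") as f:
        f.write(HEADER.pack(len(revisions)))
        for payload in revisions:
            f.write(ENTRY.pack(offset, len(payload)))
            offset += len(payload)
        for payload in revisions:
            f.write(payload)


def read_revision(pack_path, index):
    """Seek to one revision inside a pack file and return its bytes."""
    with open(pack_path, "rb") as f:
        (count,) = HEADER.unpack(f.read(HEADER.size))
        if not 0 <= index < count:
            raise IndexError(index)
        f.seek(HEADER.size + ENTRY.size * index)
        offset, length = ENTRY.unpack(f.read(ENTRY.size))
        f.seek(offset)
        return f.read(length)


def pack_in_shards(directory, revisions, shard_size):
    """Write revisions into pack files of at most shard_size each."""
    paths = []
    for start in range(0, len(revisions), shard_size):
        path = os.path.join(directory, "%d.pack" % (start // shard_size))
        write_pack(path, revisions[start:start + shard_size])
        paths.append(path)
    return paths


def read_from_shards(directory, shard_size, revnum):
    """Locate the shard holding revnum and read that revision from it."""
    path = os.path.join(directory, "%d.pack" % (revnum // shard_size))
    return read_revision(path, revnum % shard_size)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        revs = [("revision %d\n" % i).encode() for i in range(7)]
        pack_in_shards(d, revs, 3)          # three files instead of seven
        print(read_from_shards(d, 3, 4))
```

With a layout like this, a hot-copy would have to copy one file per
shard rather than one file per revision, which is the cost the
paragraph above is trying to avoid.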

Out of the dozen or so repositories that I admin at work, only one
(the big one) is using BDB.  It really is a pain to have that external
dependency for a particular BDB version, especially on Redhat
Enterprise Linux 4 (grrrrr).  But it is great to have that choice!

Troy


Re: Scalability of the new merge-tracking

Posted by David Glasser <gl...@davidglasser.net>.
On 10/15/07, Troy Curtis Jr <tr...@gmail.com> wrote:
> On 10/14/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > Can you define "scalability"?  That's a pretty vague word.  :-)
> >
> > On 10/13/07, Troy Curtis Jr <tr...@gmail.com> wrote:
> > > Has anyone been explicitly testing the scalability of the new
> > > merge-tracking features?  I'm sure that utilizing sqlite on the
> > > server-side goes a long way to keeping performance up, but it does
> > > seem that my team's code tends to push svn performance (well some
> > > people's are certainly larger than mine but, eh).  Just curious if
> > > there have been any explicit tests.
> >
>
> Too true, too true.  Really my question was just out there to see if
> anyone had done any testing toward defining/quantifying the merge
> tracking scalability.  For instance, I had to move away from FSFS as
> my backend because it didn't scale well to my deep and numerous
> directory structure (at first I thought it was number of revs >65k but
> others have more than that, then thought maybe it was size >2GB, but
> others have MUCH larger than that).  Of course it would probably be
> pretty difficult to concoct a very large repo with lots of complicated
> merges in order to test this concept.
>
> I guess it probably isn't that useful of a question to ask now that I
> sit back and think about it.

Out of curiosity, Troy, what did you move to?  Back to BDB?

I'm actively working on improving the scalability of FSFS (I committed
a few patches this week to cache revision data in RAM more
aggressively), though I haven't looked at merge-tracking specifically
yet.  (Note also that most of these caches are most effective over
svnserve or over DAV with Apache tuned to make sure that all requests
for a given user command go to the same child.)

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/


Re: Scalability of the new merge-tracking

Posted by Troy Curtis Jr <tr...@gmail.com>.
On 10/14/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> Can you define "scalability"?  That's a pretty vague word.  :-)
>
> On 10/13/07, Troy Curtis Jr <tr...@gmail.com> wrote:
> > Has anyone been explicitly testing the scalability of the new
> > merge-tracking features?  I'm sure that utilizing sqlite on the
> > server-side goes a long way to keeping performance up, but it does
> > seem that my team's code tends to push svn performance (well some
> > people's are certainly larger than mine but, eh).  Just curious if
> > there have been any explicit tests.
>

Too true, too true.  Really my question was just out there to see if
anyone had done any testing toward defining/quantifying the merge
tracking scalability.  For instance, I had to move away from FSFS as
my backend because it didn't scale well to my deep and numerous
directory structure (at first I thought it was number of revs >65k but
others have more than that, then thought maybe it was size >2GB, but
others have MUCH larger than that).  Of course it would probably be
pretty difficult to concoct a very large repo with lots of complicated
merges in order to test this concept.

I guess it probably isn't that useful of a question to ask now that I
sit back and think about it.



Re: Scalability of the new merge-tracking

Posted by Ben Collins-Sussman <su...@red-bean.com>.
Can you define "scalability"?  That's a pretty vague word.  :-)

On 10/13/07, Troy Curtis Jr <tr...@gmail.com> wrote:
> Has anyone been explicitly testing the scalability of the new
> merge-tracking features?  I'm sure that utilizing sqlite on the
> server-side goes a long way to keeping performance up, but it does
> seem that my team's code tends to push svn performance (well some
> people's are certainly larger than mine but, eh).  Just curious if
> there have been any explicit tests.
