Posted to dev@subversion.apache.org by Matthias Junker <ma...@students.unibe.ch> on 2007/10/22 10:50:12 UTC

Speed up consecutive operations

Hello,

i'm working on a client for subversion, which builds a model of a
repository in smalltalk. for now i'm just accessing the client api. but
the problem is that i need to call read-only commands like ls, diff, etc.
very often in a row. of course i use the same ctx and pool for all these
operations, but they are still very slow (like 1-2s per operation). now
my question: would i have to use the repository access layer directly to
speed up consecutive operations (by using the same session over and over
again?), or is there a way to do this using the client api? Or did i get
it all wrong? i would be really glad for any kind of hint, because i
just started developing with subversion and it's hard for me to estimate
the effort for this.

Cheers,
Matt

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Speed up consecutive operations

Posted by Dan Christian <dc...@google.com>.
My benchmarks show that SVN can be 5X faster than DAV-neon for diff.
It's pretty easy to test.  You should be able to try the SVN protocol
just by changing URLs (assuming svnserve is running for the repository
and access is set up).

It's a bit harder to switch between neon and serf at run time (it
requires running trunk, configured with both).  The code is
something like:

    /* ctx is the svn_client_ctx_t already in use for the client calls. */
    svn_config_t *servers = apr_hash_get(ctx->config,
                                         SVN_CONFIG_CATEGORY_SERVERS,
                                         APR_HASH_KEY_STRING);
    /* Select serf as the HTTP library in the [global] section. */
    svn_config_set(servers, SVN_CONFIG_SECTION_GLOBAL,
                   SVN_CONFIG_OPTION_HTTP_LIBRARY, "serf");
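
The same selection can probably be made without code, through the
runtime "servers" configuration file.  A minimal sketch, assuming a
build that includes both neon and serf and a release that still honors
this option (the exact file location depends on the platform):

```ini
# Sketch only: equivalent of the svn_config_set() call above.
[global]
http-library = serf
```

Setting it in the file affects every client that reads that
configuration, whereas the svn_config_set() call only changes the
in-memory config of the one ctx.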

100 files is pretty typical.  Things get a lot slower as you get to
thousands of files.

-Dan C

On 10/22/07, Matthias Junker <ma...@students.unibe.ch> wrote:
> DAV-neon. Then you think this causes a lot of the slow-down?
>
> the size of the repository varies, but it should be able to deal with
> repositories with large amounts of files.
>
> the repository i ran the performance test on was rather small. only
> ~50-100 files.
>
> as far as i saw, svn_sleep_for_timestamps is only used in checkout,
> cleanup, commit, copy, diff, revert, switch and update.
> since i only use diff, this might be worth to consider, but the other
> commands don't concern me.
>
> Matt
>
>
> > He may also be running afoul of svn_sleep_for_timestamps.  Using the RA
> > API would avoid that penalty.
> >
> > On Mon, 2007-10-22 at 10:42 -0700, Dan Christian wrote:
> >
> >> What access protocol are you using?
> >>
> >> DAV-neon is slowest, SVN is fastest, DAV-serf is in between.
> >> Encryption slows things down even more.
> >>
> >> How big are the queries?
> >>
> >> Large file and directories counts will reduce performance.
> >>
> >> -Dan C


Re: Speed up consecutive operations

Posted by Matthias Junker <ma...@students.unibe.ch>.
DAV-neon. Then you think this causes a lot of the slow-down?

the size of the repository varies, but it should be able to deal with
repositories with large amounts of files.

the repository i ran the performance test on was rather small. only 
~50-100 files.

as far as i saw, svn_sleep_for_timestamps is only used in checkout,
cleanup, commit, copy, diff, revert, switch and update.
since i only use diff, this might be worth considering, but the other
commands don't concern me.

Matt


> He may also be running afoul of svn_sleep_for_timestamps.  Using the RA
> API would avoid that penalty.
>
> On Mon, 2007-10-22 at 10:42 -0700, Dan Christian wrote:
>   
>> What access protocol are you using?
>>
>> DAV-neon is slowest, SVN is fastest, DAV-serf is in between.
>> Encryption slows things down even more.
>>
>> How big are the queries?
>>
>> Large file and directories counts will reduce performance.
>>
>> -Dan C
>>
>> On 10/22/07, Matthias Junker <ma...@students.unibe.ch> wrote:
>>     
>>> Hello,
>>>
>>> i'm working on a client for subversion, which builds a model of a
>>> repository in smalltalk. for now i'm just accessing the client api. but
>>> the problem is, that i need to call read only commands like ls,diff,etc
>>> very often in a row. of course i use the same ctx and pool for all these
>>> operations, but they are still very slow (like 1-2s per operation). now
>>> my question: would i have to use the repository access layer directly to
>>> speed up consecutive operations (by using the same session over and over
>>> again?), or is there a way to do this using the client api? Or did i get
>>> it all wrong? i would be really glad for any kind of hint, because i
>>> just started developing with subversion and it's hard for me to estimate
>>> the effort for this.
>>>
>>> Cheers,
>>> Matt
>>>       

Re: Speed up consecutive operations

Posted by Greg Hudson <gh...@MIT.EDU>.
He may also be running afoul of svn_sleep_for_timestamps.  Using the RA
API would avoid that penalty.

On Mon, 2007-10-22 at 10:42 -0700, Dan Christian wrote:
> What access protocol are you using?
> 
> DAV-neon is slowest, SVN is fastest, DAV-serf is in between.
> Encryption slows things down even more.
> 
> How big are the queries?
> 
> Large file and directories counts will reduce performance.
> 
> -Dan C
> 
> On 10/22/07, Matthias Junker <ma...@students.unibe.ch> wrote:
> > Hello,
> >
> > i'm working on a client for subversion, which builds a model of a
> > repository in smalltalk. for now i'm just accessing the client api. but
> > the problem is, that i need to call read only commands like ls,diff,etc
> > very often in a row. of course i use the same ctx and pool for all these
> > operations, but they are still very slow (like 1-2s per operation). now
> > my question: would i have to use the repository access layer directly to
> > speed up consecutive operations (by using the same session over and over
> > again?), or is there a way to do this using the client api? Or did i get
> > it all wrong? i would be really glad for any kind of hint, because i
> > just started developing with subversion and it's hard for me to estimate
> > the effort for this.
> >
> > Cheers,
> > Matt
> 

Re: Speed up consecutive operations

Posted by Dan Christian <dc...@google.com>.
What access protocol are you using?

DAV-neon is slowest, SVN is fastest, DAV-serf is in between.
Encryption slows things down even more.

How big are the queries?

Large file and directory counts will reduce performance.

-Dan C

On 10/22/07, Matthias Junker <ma...@students.unibe.ch> wrote:
> Hello,
>
> i'm working on a client for subversion, which builds a model of a
> repository in smalltalk. for now i'm just accessing the client api. but
> the problem is, that i need to call read only commands like ls,diff,etc
> very often in a row. of course i use the same ctx and pool for all these
> operations, but they are still very slow (like 1-2s per operation). now
> my question: would i have to use the repository access layer directly to
> speed up consecutive operations (by using the same session over and over
> again?), or is there a way to do this using the client api? Or did i get
> it all wrong? i would be really glad for any kind of hint, because i
> just started developing with subversion and it's hard for me to estimate
> the effort for this.
>
> Cheers,
> Matt


RE: Speed up consecutive operations

Posted by "Miller, Eric" <Er...@amd.com>.
> No Eric, you're not wrong.  I was simply suggesting it as an example of a
> client which batches up commands to execute over an svn_ra_session_t
> (thought it might provide some inspiration).
>
> That's a good clarification, though.  Matt, while that particular piece of
> code does perform several different types of RA calls, it's primarily
> batching all user-visible functions into a single RA call.
>
> Regardless, the approach you were considering is similar.

Yep, I basically wrote my own version of mucc using perl bindings before
knowing of its existence.  I'm still glad I did because now I have to do
the same thing for an update editor.

Eric




Re: Speed up consecutive operations

Posted by "Daniel L. Rall" <dl...@finemaltcoding.com>.
On Mon, 22 Oct 2007, Miller, Eric wrote:

> > Matt, you might want to look at svnmucc, which is an alternate
> > command-line client which runs multiple commands at a time:
> > http://svn.collab.net/repos/svn/trunk/contrib/client-side/svnmucc/
> >
> > While using the RA APIs yourself will save some overhead, I'm uncertain
> > that it'll give you a noticeable speed-up in terms of wall-clock time.
>
> I wasn't aware that mucc handled things like ls, diff, etc.  Just
> standard stuff you can do with a commit-editor.  Am I wrong?

No Eric, you're not wrong.  I was simply suggesting it as an example of a
client which batches up commands to execute over an svn_ra_session_t (thought
it might provide some inspiration).

That's a good clarification, though.  Matt, while that particular piece of
code does perform several different types of RA calls, it's primarily batching
all user-visible functions into a single RA call.

Regardless, the approach you were considering is similar.
-- 

Daniel Rall

Re: Speed up consecutive operations

Posted by "Daniel L. Rall" <dl...@collab.net>.
On Mon, 22 Oct 2007, Matthias Junker wrote:
...
> that's what i'm wondering: how much could i speed things up by using the
> RA layer instead of the client layer. i thought the authentication uses
> a lot of time... but then again, i'm not yet very familiar

Using the RA layer isn't going to avoid the need for auth.  For that, you'd
need to use the svn_repos.h (with NULL authz callbacks) or svn_fs.h APIs,
which you wouldn't want to do when building a typical client.

Using the RA layer simply allows you to run several commands over the same
RA session.  The savings will differ depending on which RA layer you're
using and how it interacts with the network (e.g. HTTP/S has a lot of
overhead, SVN's custom protocol is a bit faster, and local access will offer
practically no savings).  Basically, we're I/O bound, with most CPU time
spent calculating checksums, tracing history, etc.
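
To make the trade-off above concrete, here is a rough back-of-the-envelope
model in C.  It is only a sketch: it is not part of any Subversion API, the
names ra_cost_t, cost_fresh_sessions, cost_shared_session and reuse_savings
are hypothetical, and it assumes that every operation without session reuse
pays one fixed setup cost (connect plus auth handshake), which the thread
suggests but does not confirm.

```c
#include <assert.h>

/* Hypothetical cost model, not a Subversion type.
   setup_ms: assumed fixed cost to open a session (connect, auth handshake).
   op_ms:    assumed cost of one read-only request on an open session. */
typedef struct {
    double setup_ms;
    double op_ms;
} ra_cost_t;

/* Total time when every operation opens its own session. */
double cost_fresh_sessions(ra_cost_t c, int n_ops)
{
    assert(n_ops >= 0);
    return n_ops * (c.setup_ms + c.op_ms);
}

/* Total time when one session is opened once and reused for all operations. */
double cost_shared_session(ra_cost_t c, int n_ops)
{
    assert(n_ops >= 0);
    return n_ops == 0 ? 0.0 : c.setup_ms + n_ops * c.op_ms;
}

/* Fraction of wall-clock time saved by reusing the session (0 = no gain). */
double reuse_savings(ra_cost_t c, int n_ops)
{
    double fresh = cost_fresh_sessions(c, n_ops);
    if (fresh <= 0.0)
        return 0.0;
    return 1.0 - cost_shared_session(c, n_ops) / fresh;
}
```

With made-up numbers such as setup_ms = 800 and op_ms = 200, ten operations
cost 10000 ms with fresh sessions and 2800 ms over a shared one.  With
setup_ms = 0, as for local access, reuse saves nothing, which matches the
point made above.  Real figures have to be measured per RA layer.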

Re: Speed up consecutive operations

Posted by Matthias Junker <ma...@students.unibe.ch>.
thanks, will take a look at it.

that's what i'm wondering: how much could i speed things up by using the
RA layer instead of the client layer? i thought the authentication uses a
lot of time... but then again, i'm not yet very familiar with subversion.
any other opinions?

Matt

>> Matt, you might want to look at svnmucc, which is an alternate
>> command-line client which runs multiple commands at a time:
>> http://svn.collab.net/repos/svn/trunk/contrib/client-side/svnmucc/
>>
>> While using the RA APIs yourself will save some overhead, I'm uncertain
>> that it'll give you a noticeable speed-up in terms of wall-clock time.
>
> I wasn't aware that mucc handled things like ls, diff, etc.  Just
> standard stuff you can do with a commit-editor.  Am I wrong?
>
> Eric

RE: Speed up consecutive operations

Posted by "Miller, Eric" <Er...@amd.com>.
> Matt, you might want to look at svnmucc, which is an alternate
> command-line client which runs multiple commands at a time:
> http://svn.collab.net/repos/svn/trunk/contrib/client-side/svnmucc/
>
> While using the RA APIs yourself will save some overhead, I'm uncertain
> that it'll give you a noticeable speed-up in terms of wall-clock time.

I wasn't aware that mucc handled things like ls, diff, etc.  Just
standard stuff you can do with a commit-editor.  Am I wrong?

Eric




Re: Speed up consecutive operations

Posted by "Daniel L. Rall" <dl...@collab.net>.
On Mon, 22 Oct 2007, Matthias Junker wrote:

> Hello,
> 
> i'm working on a client for subversion, which builds a model of a 
> repository in smalltalk. for now i'm just accessing the client api. but 
> the problem is, that i need to call read only commands like ls,diff,etc 
> very often in a row. of course i use the same ctx and pool for all these 
> operations, but they are still very slow (like 1-2s per operation). now 
> my question: would i have to use the repository access layer directly to 
> speed up consecutive operations (by using the same session over and over 
> again?), or is there a way to do this using the client api? Or did i get 
> it all wrong? i would be really glad for any kind of hint, because i 
> just started developing with subversion and it's hard for me to estimate 
> the effort for this.

Matt, you might want to look at svnmucc, which is an alternate command-line
client which runs multiple commands at a time:
http://svn.collab.net/repos/svn/trunk/contrib/client-side/svnmucc/

While using the RA APIs yourself will save some overhead, I'm uncertain that
it'll give you a noticeable speed-up in terms of wall-clock time.
-- 

Daniel Rall

Re: Speed up consecutive operations

Posted by Karl Fogel <kf...@red-bean.com>.
Matthias Junker <ma...@students.unibe.ch> writes:
> i'm working on a client for subversion, which builds a model of a
> repository in smalltalk. for now i'm just accessing the client
> api. but the problem is, that i need to call read only commands like
> ls,diff,etc very often in a row. of course i use the same ctx and pool
> for all these operations, but they are still very slow (like 1-2s per
> operation). now my question: would i have to use the repository access
> layer directly to speed up consecutive operations (by using the same
> session over and over again?), or is there a way to do this using the
> client api? Or did i get it all wrong? i would be really glad for any
> kind of hint, because i just started developing with subversion and
> it's hard for me to estimate the effort for this.

For future reference, this sort of question belongs on users@, not dev@.

   http://subversion.tigris.org/mailing-list-guidelines.html#where

No big deal, just letting you know for next time.

-Karl
