You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@subversion.apache.org by Julian Foad <ju...@wandisco.com> on 2011/09/14 17:07:19 UTC

[PATCH] Speed up client by re-using RA session connections

A patch in progress, for demonstration and your feedback.

Since long ago we've thought about letting the client, through the
libsvn_client API, share a single RA connection across a series of
operations -- e.g.

  "svn update a b c" currently opens 3 RA sesssions.

Similarly, inside libsvn_client we often open up an extra RA session
when we could have re-used an old one -- e.g.

  Update with externals -- opens a new session per external [1]

  "svn mergeinfo" -- opens two or three sessions

The attached patch implements "caching" of connections that have been
used and may be used again.  The cache is initialized (to empty) by the
caller (the client executable) and connections in it are established or
re-used by libsvn_client as required.

This patch demonstrates the usage inside "svn status" and "svn info"
subcommands, and also (partially) inside svn_client_mergeinfo_log()
which is the guts of the "svn mergeinfo" command.

In this patch, my idea is that the client passes the cache explicitly to
each API (or passes NULL to not use a cache), so that the client retains
control of the lifetime of these connections.  The "svn" client, for
example, may not worry about ever clearing the cache whereas a
long-running GUI may clear it after a certain period of time, and/or
limit the number of entries in it.

Thoughts?


In the email threads referenced below, we discussed some related issues
such as whether it would be better to use root-relative paths all the
time instead of "parenting" a session at a specific URL, and how this
ties in with authenticating for access to a specific path.  But I don't
think those issues need to be addressed before adding the basic ability
to re-use a session at all.

Prior discussions:

[1] Email thread "Re-use connection for svn:externals", from Phillip
Hellewell on 2010-02-08,
<http://svn.haxx.se/dev/archive-2010-02/0168.shtml>.

[2] Email thread "[RFC] Concept of RA session relative paths", from Ivan
Zhakov on 2009-10-13,
<http://svn.haxx.se/dev/archive-2009-10/0334.shtml>.

- Julian

Re: [PATCH] Speed up client by re-using RA session connections

Posted by Ivan Zhakov <iv...@visualsvn.com>.

On Wed, Sep 14, 2011 at 19:29, Daniel Shahaf <da...@elego.de> wrote:
>>   - Implement a private API for libsvn_client functions to use instead of
>>     simply opening a new session. It re-uses a session from the cache, if
>>     present, or opens a new connection if necessary.
>
> As far as I can see, the "get a session from the cache" code assumes
> that the cache is used by a single thread.  (Consider what happens if
> one thread retrieves or uses a cached session while another retrieves
> the same session from the cache.)  Will the cache be thread-safe?
>
We can create differnet RA session cache for each
svn_client_context_t instance, which is cannot be shared between
threads.

-- 
Ivan Zhakov

Re: [PATCH] Speed up client by re-using RA session connections

Posted by Stefan Küng <to...@gmail.com>.

On 14.09.2011 22:22, C. Michael Pilato wrote:
> On 09/14/2011 04:13 PM, Stefan Küng wrote:
>> On 14.09.2011 22:01, C. Michael Pilato wrote:
>>> On 09/14/2011 02:04 PM, Julian Foad wrote:
>>>> In the "status" API, in this patch, it appears that two separate
>>>> connections are wanted at the same time.  That may not be necessary
>>>> there, and if the code can be rearranged to do the "get locks" before or
>>>> after the main work on a single connection then great, that's an
>>>> independent improvement; but for now let's ignore that possibility and
>>>> assume that in some cases like this we do need two parallel sessions.
>>>> The way I've handled this is to make the *caller* pass in two separate
>>>> caches, but that's ugly, exposing an implementation detail.  A better
>>>> way would be for the code to "check out" two separate sessions from the
>>>> same cache.
>>>
>>> Ugh.  No svn_client.h public function should need to an ra_session_t or
>>> cache thereof.  Client operations at this level are designed to be
>>> self-contained enough to not require that every single individual client
>>> program manage such a low-level concept as a RA session.  (And if there are
>>> API functions which fail that design tenet, we should try to understand why
>>> and remedy that problem directly.)
>>
>> I'd like to point out that in TSVN, we use svn_client_open_ra_session() to
>> get an ra_session_t in some places. We use it mostly to call
>> svn_ra_get_latest_revnum() and svn_ra_get_repos_root2() in the same session.
>> And in two other places we call svn_ra_get_uuid2() and svn_ra_get_locks2()
>> (not in the same session).
>
> My point wasn't to say "clients should be disallowed from using the RA layer
> directly".  Only that, by and large, the major version control client
> operations should speak in terms of paths, URLs, and revisions -- not
> low-level RA sessions.
>
> Thinking of your use-case specifically, Stefan, would you benefit from a new
> a client API function -- svn_client_get_repos_info() -- which fetches the
> repository root URL, repository UUID, and (optionally) the current youngest
> revision all in one fell swoop?

Yes, that would of course help a lot!
And maybe svn_client_get_locks() which wraps svn_ra_get_locks2(). Then 
we could get rid of using svn_ra_* completely.

Stefan

-- 
        ___
   oo  // \\      "De Chelonian Mobile"
  (_,\/ \_/ \     TortoiseSVN
    \ \_/_\_/>    The coolest Interface to (Sub)Version Control
    /_/   \_\     http://tortoisesvn.net

Re: [PATCH] Speed up client by re-using RA session connections

Posted by "C. Michael Pilato" <cm...@collab.net>.

On 09/14/2011 04:13 PM, Stefan Küng wrote:
> On 14.09.2011 22:01, C. Michael Pilato wrote:
>> On 09/14/2011 02:04 PM, Julian Foad wrote:
>>> In the "status" API, in this patch, it appears that two separate
>>> connections are wanted at the same time.  That may not be necessary
>>> there, and if the code can be rearranged to do the "get locks" before or
>>> after the main work on a single connection then great, that's an
>>> independent improvement; but for now let's ignore that possibility and
>>> assume that in some cases like this we do need two parallel sessions.
>>> The way I've handled this is to make the *caller* pass in two separate
>>> caches, but that's ugly, exposing an implementation detail.  A better
>>> way would be for the code to "check out" two separate sessions from the
>>> same cache.
>>
>> Ugh.  No svn_client.h public function should need to an ra_session_t or
>> cache thereof.  Client operations at this level are designed to be
>> self-contained enough to not require that every single individual client
>> program manage such a low-level concept as a RA session.  (And if there are
>> API functions which fail that design tenet, we should try to understand why
>> and remedy that problem directly.)
> 
> I'd like to point out that in TSVN, we use svn_client_open_ra_session() to
> get an ra_session_t in some places. We use it mostly to call
> svn_ra_get_latest_revnum() and svn_ra_get_repos_root2() in the same session.
> And in two other places we call svn_ra_get_uuid2() and svn_ra_get_locks2()
> (not in the same session).

My point wasn't to say "clients should be disallowed from using the RA layer
directly".  Only that, by and large, the major version control client
operations should speak in terms of paths, URLs, and revisions -- not
low-level RA sessions.

Thinking of your use-case specifically, Stefan, would you benefit from a new
a client API function -- svn_client_get_repos_info() -- which fetches the
repository root URL, repository UUID, and (optionally) the current youngest
revision all in one fell swoop?

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Re: [PATCH] Speed up client by re-using RA session connections

Posted by Stefan Küng <to...@gmail.com>.

On 14.09.2011 22:01, C. Michael Pilato wrote:
> On 09/14/2011 02:04 PM, Julian Foad wrote:
>> In the "status" API, in this patch, it appears that two separate
>> connections are wanted at the same time.  That may not be necessary
>> there, and if the code can be rearranged to do the "get locks" before or
>> after the main work on a single connection then great, that's an
>> independent improvement; but for now let's ignore that possibility and
>> assume that in some cases like this we do need two parallel sessions.
>> The way I've handled this is to make the *caller* pass in two separate
>> caches, but that's ugly, exposing an implementation detail.  A better
>> way would be for the code to "check out" two separate sessions from the
>> same cache.
>
> Ugh.  No svn_client.h public function should need to an ra_session_t or
> cache thereof.  Client operations at this level are designed to be
> self-contained enough to not require that every single individual client
> program manage such a low-level concept as a RA session.  (And if there are
> API functions which fail that design tenet, we should try to understand why
> and remedy that problem directly.)

I'd like to point out that in TSVN, we use svn_client_open_ra_session() 
to get an ra_session_t in some places. We use it mostly to call 
svn_ra_get_latest_revnum() and svn_ra_get_repos_root2() in the same 
session. And in two other places we call svn_ra_get_uuid2() and 
svn_ra_get_locks2() (not in the same session).

Stefan

-- 
        ___
   oo  // \\      "De Chelonian Mobile"
  (_,\/ \_/ \     TortoiseSVN
    \ \_/_\_/>    The coolest Interface to (Sub)Version Control
    /_/   \_\     http://tortoisesvn.net

Re: [PATCH] Speed up client by re-using RA session connections

Posted by "C. Michael Pilato" <cm...@collab.net>.

On 09/14/2011 02:04 PM, Julian Foad wrote:
> In the "status" API, in this patch, it appears that two separate
> connections are wanted at the same time.  That may not be necessary
> there, and if the code can be rearranged to do the "get locks" before or
> after the main work on a single connection then great, that's an
> independent improvement; but for now let's ignore that possibility and
> assume that in some cases like this we do need two parallel sessions.
> The way I've handled this is to make the *caller* pass in two separate
> caches, but that's ugly, exposing an implementation detail.  A better
> way would be for the code to "check out" two separate sessions from the
> same cache.

Ugh.  No svn_client.h public function should need to an ra_session_t or
cache thereof.  Client operations at this level are designed to be
self-contained enough to not require that every single individual client
program manage such a low-level concept as a RA session.  (And if there are
API functions which fail that design tenet, we should try to understand why
and remedy that problem directly.)

Have you considered storing your cache in the svn_client_context_t baton?
Does that allow the behavior you want without asking callers to do
svn_ra-ish stuff?

> Another thought: should part (or all) of the support for this kind of
> cache be provided by the svn_ra API rather than libsvn_client?  Maybe
> with libsvn_client wrapping it in a libsvn_client presentation API.

I suppose this logic could live in the RA layer.

I haven't read the patch, but does it take into account that today, RA
sessions opened by client functions have differing sets of callback
functions implemented depending on how
svn_client__open_ra_session_internal() was called?

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Re: [PATCH] Speed up client by re-using RA session connections

Posted by Julian Foad <ju...@wandisco.com>.

Improvements already.

This current design is simplistic in that the "end of usage" of a given
session is implicit.  It would be better if (inspired by Hyrum's comment
on IRC) we would "check out" a session from the cache and then "return"
that session to the cache when we've finished using it.  That design
could then be extended to be thread-safe later, allowing a single cache
to be shared by multiple threads.  The svn_client__cached_ra_session()
function would be made thread-safe and would allow a given cached
session to be "checked out" by only one thread at a time.  It would open
a new connection to the same repo if there was no currently *available*
connection to that repo.

That "sharable" cache would be better for parallelizing clients.

In the "status" API, in this patch, it appears that two separate
connections are wanted at the same time.  That may not be necessary
there, and if the code can be rearranged to do the "get locks" before or
after the main work on a single connection then great, that's an
independent improvement; but for now let's ignore that possibility and
assume that in some cases like this we do need two parallel sessions.
The way I've handled this is to make the *caller* pass in two separate
caches, but that's ugly, exposing an implementation detail.  A better
way would be for the code to "check out" two separate sessions from the
same cache.

Another thought: should part (or all) of the support for this kind of
cache be provided by the svn_ra API rather than libsvn_client?  Maybe
with libsvn_client wrapping it in a libsvn_client presentation API.

- Julian

On Wed, 2011-09-14 at 17:59 +0100, Julian Foad wrote:
> Daniel Shahaf wrote:
> > Julian Foad wrote on Wed, Sep 14, 2011 at 16:07:19 +0100:
> > > Enable libsvn_client APIs to re-use a previous RA session instead of always
> > > opening a new connection.
> > > 
> > > Basically:
> > > 
> > >   - Declare an opaque "RA session cache" object in the public API. It holds
> > >     one open RA session per repository, for any number of repositories.
> >  
> > Why per repository?
> 
> Because the caller of a libsvn_client API generally provides a
> "path_or_url" without knowing which repository it refers to.  If the
> client is calling e.g. "update a/", "update b/", "update c/", it doesn't
> know in advance whether those paths refer to different repositories.
> Suppose only "a" and "c" are in the same repo; the client can't pass
> just a single cached session if we want to re-use the "a" session for
> "c".
> 
> >   What about a client that runs 'svn update' on several
> > wc's of the same URL concurrently?
> 
> I haven't attempted to support concurrent calls.  The cache is for
> sequential use only (one user of the cache at a time).  If a client is
> doing parallel updates and want to use this kind of cache it must have a
> separate cache per thread.  I think.
> 
> > >   - Implement a private API for libsvn_client functions to use instead of
> > >     simply opening a new session. It re-uses a session from the cache, if
> > >     present, or opens a new connection if necessary.
> > 
> > As far as I can see, the "get a session from the cache" code assumes
> > that the cache is used by a single thread.  (Consider what happens if
> > one thread retrieves or uses a cached session while another retrieves
> > the same session from the cache.)  Will the cache be thread-safe?
> 
> See above.
> 
> - Julian
> 
>

Re: [PATCH] Speed up client by re-using RA session connections

Posted by Julian Foad <ju...@wandisco.com>.

Daniel Shahaf wrote:
> Julian Foad wrote on Wed, Sep 14, 2011 at 16:07:19 +0100:
> > Enable libsvn_client APIs to re-use a previous RA session instead of always
> > opening a new connection.
> > 
> > Basically:
> > 
> >   - Declare an opaque "RA session cache" object in the public API. It holds
> >     one open RA session per repository, for any number of repositories.
>  
> Why per repository?

Because the caller of a libsvn_client API generally provides a
"path_or_url" without knowing which repository it refers to.  If the
client is calling e.g. "update a/", "update b/", "update c/", it doesn't
know in advance whether those paths refer to different repositories.
Suppose only "a" and "c" are in the same repo; the client can't pass
just a single cached session if we want to re-use the "a" session for
"c".

>   What about a client that runs 'svn update' on several
> wc's of the same URL concurrently?

I haven't attempted to support concurrent calls.  The cache is for
sequential use only (one user of the cache at a time).  If a client is
doing parallel updates and want to use this kind of cache it must have a
separate cache per thread.  I think.

> >   - Implement a private API for libsvn_client functions to use instead of
> >     simply opening a new session. It re-uses a session from the cache, if
> >     present, or opens a new connection if necessary.
> 
> As far as I can see, the "get a session from the cache" code assumes
> that the cache is used by a single thread.  (Consider what happens if
> one thread retrieves or uses a cached session while another retrieves
> the same session from the cache.)  Will the cache be thread-safe?

See above.

- Julian

Re: [PATCH] Speed up client by re-using RA session connections

Posted by Daniel Shahaf <da...@elego.de>.

Julian Foad wrote on Wed, Sep 14, 2011 at 16:07:19 +0100:
> Enable libsvn_client APIs to re-use a previous RA session instead of always
> opening a new connection.
> 
> Basically:
> 
>   - Declare an opaque "RA session cache" object in the public API. It holds
>     one open RA session per repository, for any number of repositories.
> 

Why per repository?  What about a client that runs 'svn update' on several
wc's of the same URL concurrently?

>   - Implement a private API for libsvn_client functions to use instead of
>     simply opening a new session. It re-uses a session from the cache, if
>     present, or opens a new connection if necessary.

As far as I can see, the "get a session from the cache" code assumes
that the cache is used by a single thread.  (Consider what happens if
one thread retrieves or uses a cached session while another retrieves
the same session from the cache.)  Will the cache be thread-safe?

Re: [PATCH] Speed up client by re-using RA session connections

Posted by Ivan Zhakov <iv...@visualsvn.com>.

On Thu, Sep 15, 2011 at 16:05, Julian Foad <ju...@wandisco.com> wrote:
> Thanks for the feedback, everyone.  Looks like the principle of re-using
> connections is sound but this implementation is way wrong :-)
>
> I'm not working much on this right now and will come back to it later,
> but here's a dump of my current brain state.
>
>
[...]

>
>
Hi Julian,

Thanks for great discussion summary.

I still think that we need to implement explicit call function, to use
it in functions like svn_client__get_repos_root():
[[[
  svn_ra_session_t *ra_session;

  /* If PATH_OR_URL is a local path we can fetch the repos root locally. */
  if (!svn_path_is_url(abspath_or_url))
    {
      SVN_ERR(svn_wc__node_get_repos_info(repos_root, NULL,
                                          ctx->wc_ctx, abspath_or_url,
                                          result_pool, scratch_pool));

      return SVN_NO_ERROR;
    }

  /* If PATH_OR_URL was a URL, we use the RA layer to look it up. */
  SVN_ERR(svn_client__open_ra_session_internal(&ra_session, NULL,
                                               abspath_or_url,
                                               NULL, NULL, FALSE, TRUE,
                                               ctx, scratch_pool));

  SVN_ERR(svn_ra_get_repos_root2(ra_session, repos_root, result_pool));
]]]

And callers use  pattern like this:
[[[
  /* Determine the working copy target's repository root URL. */
  SVN_ERR(svn_client__get_repos_root(&wc_repos_root, target_abspath,
                                     ctx, scratch_pool, scratch_pool));

  /* Determine the source's repository root URL. */
  SVN_ERR(svn_client__get_repos_root(&source_repos_root, url2,
                                     ctx, scratch_pool, scratch_pool));
]]

Creating subpools for each call to such function is not inconvenient,
but I agree that we can add explicit close any time.

-- 
Ivan Zhakov

Re: [PATCH] Speed up client by re-using RA session connections

Posted by Julian Foad <ju...@wandisco.com>.

Thanks for the feedback, everyone.  Looks like the principle of re-using
connections is sound but this implementation is way wrong :-)

I'm not working much on this right now and will come back to it later,
but here's a dump of my current brain state.


Putting the sessions cache in svn_client_context_t
--------------------------------------------------

Yes, that's better.  I had thought of that initially, but then I decided
it would not give a long-lived client enough control of the lifetime of
the sessions.  The client would be able to clear the cache from time to
time, of course, but it would not be able to use one set of sessions for
one sub-task and another set of sessions for another sub-task.  But I
now suppose there are better ways to manage that issue -- perhaps using
one svn_client_context_t per sub-task, or providing and using
libsvn_client public APIs for controlling the cache.


Implementation at the libsvn_ra layer
-------------------------------------

Implementing the "get a cached session" logic in libsvn_client is
problematic.  In that first patch, in libsvn_client I implemented
replacement functions for a couple of high-level "open an RA session"
private APIs.  But, as C-Mike pointed out, "RA sessions opened by client
functions have differing sets of callback functions implemented
depending on how svn_client__open_ra_session_internal() was called".  We
don't always want to retrieve the exact same svn_ra_session_t that was
being used before, we want a different one that connects to the same
repository.

So this intervention would work better right at the bottom of
libsvn_client, where we call svn_ra_open4() inside
svn_client__open_ra_session_internal().

This requires support from libsvn_ra -- the ability to "re-initialize" a
given session object with a new set of callbacks and other config data.
Or maybe the functionality should be described as "taking over" a given
RA session for a new connection to the same repository.

So my current thinking is we need a new RA function alongside
svn_ra_open4(), that modifies the given session object (or logically
"closes" the old session object and returns a new one) to represent a
new session without going through a re-connection round-trip.


Pool clean-up vs. an explicit "release" function
------------------------------------------------

Having pool clean-up be the primary release mechanism is fine and good.
It ensures the release will happen at some point even if a function
exits with an error.  Already, each caller that wants to explicitly
close the session is allocating it in a dedicated subpool and then
destroying that subpool.  If we now want libsvn_client to perform
different clean-up actions such as returning the session to the cache
instead of closing it, the general solution is for libsvn_client's "open
RA session" function to register its clean-up function on the
caller-provided pool, but when it opens a new RA session allocate that
in a "cache" pool that has a long lifetime.  (Or in a subpool of the
cache pool, if we want to be able to close the cached connections
independently.)

We can always implement a "close" function as well, if we think that is
a useful option for some callers.  That can be implemented on top of the
pool clean-up method of closing, at the libsvn_ra level and/or the
libsvn_client level.  That would be an independent enhancement.


Clearing the cache; controlling the life-time of the cached sessions
--------------------------------------------------------------------

There should be a libsvn_client public API to

  - close down all the cached sessions in the svn_client_ctx_t

and probably also to

  - close down a cached session to a specific repository


- Julian

Re: [PATCH] Speed up client by re-using RA session connections

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

Hyrum K Wright wrote on Wed, Sep 14, 2011 at 16:13:41 -0500:
> On Wed, Sep 14, 2011 at 4:00 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:
> > On Wed, Sep 14, 2011 at 19:07, Julian Foad <ju...@wandisco.com> wrote:
> >> A patch in progress, for demonstration and your feedback.
> >>
> >> Since long ago we've thought about letting the client, through the
> >> libsvn_client API, share a single RA connection across a series of
> >> operations -- e.g.
> >>
> >>  "svn update a b c" currently opens 3 RA sesssions.
> >>
> >> Similarly, inside libsvn_client we often open up an extra RA session
> >> when we could have re-used an old one -- e.g.
> >>
> >>  Update with externals -- opens a new session per external [1]
> >>
> >>  "svn mergeinfo" -- opens two or three sessions
> >>
> >> The attached patch implements "caching" of connections that have been
> >> used and may be used again.  The cache is initialized (to empty) by the
> >> caller (the client executable) and connections in it are established or
> >> re-used by libsvn_client as required.
> >>
> > Hi Julian,
> >
> > My original idea was to make ra session pool (cache) to be part of
> > svn_client_context_t, so callers of svn_client_* API do not have to be
> > modified. And pattern usage of this pool would by something like:
> >
> > svn_ra_session_t * ra = svn_client__ra_pool_get_session(ctx, pool)
> > [perform some operations with RA session]
> >
> > then call
> >
> > svn_client_ra_pool_release_session(ctx, ra);
> >
> > (RA session will be automatically released back to session pool
> > (cache) when pool used in svn_client__ra_pool_get_session() is
> > cleared)
> >
> > What do you think?
> 
> I'll note that we may not even need an explicit release function, as
> the current ra session infrastructure simply installs a pool cleanup
> hook to do proper shutdown.  We could modify this hook to release the
> ra session back to the cache,

Do those pool cleanup handlers live in libsvn_client?  If not, how could
they release the session to the cache?

Re: [PATCH] Speed up client by re-using RA session connections

Posted by Ivan Zhakov <iv...@visualsvn.com>.

On Thu, Sep 15, 2011 at 01:13, Hyrum K Wright <hy...@wandisco.com> wrote:
> On Wed, Sep 14, 2011 at 4:00 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:
>> On Wed, Sep 14, 2011 at 19:07, Julian Foad <ju...@wandisco.com> wrote:
>>> A patch in progress, for demonstration and your feedback.
>>>
>>> Since long ago we've thought about letting the client, through the
>>> libsvn_client API, share a single RA connection across a series of
>>> operations -- e.g.
>>>
>>>  "svn update a b c" currently opens 3 RA sesssions.
>>>
>>> Similarly, inside libsvn_client we often open up an extra RA session
>>> when we could have re-used an old one -- e.g.
>>>
>>>  Update with externals -- opens a new session per external [1]
>>>
>>>  "svn mergeinfo" -- opens two or three sessions
>>>
>>> The attached patch implements "caching" of connections that have been
>>> used and may be used again.  The cache is initialized (to empty) by the
>>> caller (the client executable) and connections in it are established or
>>> re-used by libsvn_client as required.
>>>
>> Hi Julian,
>>
>> My original idea was to make ra session pool (cache) to be part of
>> svn_client_context_t, so callers of svn_client_* API do not have to be
>> modified. And pattern usage of this pool would by something like:
>>
>> svn_ra_session_t * ra = svn_client__ra_pool_get_session(ctx, pool)
>> [perform some operations with RA session]
>>
>> then call
>>
>> svn_client_ra_pool_release_session(ctx, ra);
>>
>> (RA session will be automatically released back to session pool
>> (cache) when pool used in svn_client__ra_pool_get_session() is
>> cleared)
>>
>> What do you think?
>
> I'll note that we may not even need an explicit release function, as
> the current ra session infrastructure simply installs a pool cleanup
> hook to do proper shutdown.  We could modify this hook to release the
> ra session back to the cache, rather than cleaning up the session (and
> in turn, the pool in which the cache is allocated could cleanup the
> cache and its contents).
>
> In all this we need to remember long-lived clients, and be sure we
> give them a way to kill the cache when it suites them.
>
It still worth to have explicit release function for cached RA
sessions, like we have for files and streams. Even they're
automatically closed on pool cleanup we often need just to say 'please
close this file, I'm finished working with it".

-- 
Ivan Zhakov

Re: [PATCH] Speed up client by re-using RA session connections

Posted by Hyrum K Wright <hy...@wandisco.com>.

On Wed, Sep 14, 2011 at 4:00 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:
> On Wed, Sep 14, 2011 at 19:07, Julian Foad <ju...@wandisco.com> wrote:
>> A patch in progress, for demonstration and your feedback.
>>
>> Since long ago we've thought about letting the client, through the
>> libsvn_client API, share a single RA connection across a series of
>> operations -- e.g.
>>
>>  "svn update a b c" currently opens 3 RA sesssions.
>>
>> Similarly, inside libsvn_client we often open up an extra RA session
>> when we could have re-used an old one -- e.g.
>>
>>  Update with externals -- opens a new session per external [1]
>>
>>  "svn mergeinfo" -- opens two or three sessions
>>
>> The attached patch implements "caching" of connections that have been
>> used and may be used again.  The cache is initialized (to empty) by the
>> caller (the client executable) and connections in it are established or
>> re-used by libsvn_client as required.
>>
> Hi Julian,
>
> My original idea was to make ra session pool (cache) to be part of
> svn_client_context_t, so callers of svn_client_* API do not have to be
> modified. And pattern usage of this pool would by something like:
>
> svn_ra_session_t * ra = svn_client__ra_pool_get_session(ctx, pool)
> [perform some operations with RA session]
>
> then call
>
> svn_client_ra_pool_release_session(ctx, ra);
>
> (RA session will be automatically released back to session pool
> (cache) when pool used in svn_client__ra_pool_get_session() is
> cleared)
>
> What do you think?

I'll note that we may not even need an explicit release function, as
the current ra session infrastructure simply installs a pool cleanup
hook to do proper shutdown.  We could modify this hook to release the
ra session back to the cache, rather than cleaning up the session (and
in turn, the pool in which the cache is allocated could cleanup the
cache and its contents).

In all this we need to remember long-lived clients, and be sure we
give them a way to kill the cache when it suites them.

-Hyrum


-- 

uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/

Re: [PATCH] Speed up client by re-using RA session connections

Posted by Ivan Zhakov <iv...@visualsvn.com>.

On Wed, Sep 14, 2011 at 19:07, Julian Foad <ju...@wandisco.com> wrote:
> A patch in progress, for demonstration and your feedback.
>
> Since long ago we've thought about letting the client, through the
> libsvn_client API, share a single RA connection across a series of
> operations -- e.g.
>
>  "svn update a b c" currently opens 3 RA sesssions.
>
> Similarly, inside libsvn_client we often open up an extra RA session
> when we could have re-used an old one -- e.g.
>
>  Update with externals -- opens a new session per external [1]
>
>  "svn mergeinfo" -- opens two or three sessions
>
> The attached patch implements "caching" of connections that have been
> used and may be used again.  The cache is initialized (to empty) by the
> caller (the client executable) and connections in it are established or
> re-used by libsvn_client as required.
>
Hi Julian,

My original idea was to make ra session pool (cache) to be part of
svn_client_context_t, so callers of svn_client_* API do not have to be
modified. And pattern usage of this pool would by something like:

svn_ra_session_t * ra = svn_client__ra_pool_get_session(ctx, pool)
[perform some operations with RA session]

then call

svn_client_ra_pool_release_session(ctx, ra);

(RA session will be automatically released back to session pool
(cache) when pool used in svn_client__ra_pool_get_session() is
cleared)

What do you think?



-- 
Ivan Zhakov