You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@subversion.apache.org by Julian Foad <ju...@btopenworld.com> on 2012/07/25 00:50:24 UTC

Re: svn commit: r1365324 - reparenting an RA session each time it's used

Bert pointed out on IRC that changes like this one (and I have been making other similar changes for some time) could potentially have an adverse performance effect in some cases, because reparenting an RA session is not free.  Our IRC chat [1]:

Bert julianf: I hope you can get rid of those two extra ra sessions in your improvements (re: r1365182) 
julianf Bert: Sure. 
julianf Doing RA session rationalization right now, in fact, for other functions in the merge code, so that I can do it there. 
julianf Many merge functions were taking two different sessions just because they 
relied on the session's current 'parent' URL matching the URL to be 
operated on. By making all the functions not care where the session is 
parented, most functions can work with just one session. 
julianf (Only functions where the two sessions might be opened to two different 
repositories will still need two -- and there are not many such 
functions because we don't support diffing between two repos. If we ever do in the future, I'm confident the changes we'll need will not just be reverting this.) 
julianf There are some functions where the merge source and the merge target can be 
in different repos. But most of those functions don't take a session for the target anyway. 
Bert julianf: Reparenting an ra session is very cheap compared to creating a new ra 
session, but in our current ra implementations it might use network io 
(e.g. in ra_svn) and it uses a bit of memory in the session pool. So if 
we really need two locations in some place during merge it might help to keep using two throughout the merge code. 
julianf We discussed this a bit on the list some weeks or months back and it's 
more forward-thinking (towards a time when we 'cache' and re-use 
sessions more globally) and cleaner overall to re-use a single session. 
Bert (We should be able to resolve the leak for all reasonably modern servers, 
but very old ra_svn servers don't allow reparenting and we just reopen 
an ra session) 
julianf Another way of looking at it is it's moving towards the design model where a 
session isn't "parented" at an interesting URL but always at the root of the repo and callers always specify a root-relative path. 
julianf Also note that the low-level reparenting functions optimize out that case where the URL isn't actually changing. 
Bert *nod* But if you hop between two urls its always changing. 
Bert I'm not sure if we do, but by switching from 2 to 1 that might make a difference. 
julianf Well, the main big deal is that simplifying the merge code is far more 
significant than an extra round-trip or ten when talking to an old 
ra_svn server or any other minor time penalties. 
Bert For every ra server it is a few roundtrips (as reparenting is forwarded to 
the server). For very old servers its adding another connection (and if I remember correctly not even closing the old one). 
Bert s/For every ra server/For every ra_svn server/ (for ra_serf it is a local operation) 
Bert For those old versions it might even be typing your password again. 
julianf Reparenting is widespread in our code already. 
julianf If it's that big a problem, wouldn't we know about it? 
Bert julianf: Did we know about those slow sqlite queries that did a table scan? 
Bert I'm not sure if it is a big problem, but switching from 2 to 1 ra session 
might double the reparentings. And that *might* make a difference for 
some use-cases. 
Bert I'm not sure and I doubt anyone knows. 
julianf OK, please raise it on dev@. 
Bert In 1.6 we just opened 4-6 ra sessions in a common merge, probably more. 
julianf I'll try to find the thread where we last discussed it. 
Bert I just tried to add that a reparent is not a 'free' operation... It is 
still an expensive operation compared to many other operations... But 
compared to opening a new ra session it is still at least 10 times 
faster. 
Bert Just like every wc_db operation by itself is cheap, but adding all of them 
they still slow down some common operations that operate per node. 
julianf OK, I know that. Do you think we need to discuss on dev@ and try to find 
the scenarios where it might be slower than today and measure it? 
Bert And maybe during merge most urls do match in both sessions so they are optimized away in most cases. I haven't tested that. 
julianf Tell you what... I#ll email dev@. 

My gut feeling is this isn't going to be significant, but does anyone have any concern here or think we should try to figure out what the worst cases could be and test or measure them?
- Julian

[1] <http://colabti.org/irclogger/irclogger_log/svn-dev?date=2012-07-24#l180>.

julianfoad@apache.org wrote:

> * subversion/libsvn_client/merge.c
>   (find_unsynced_ranges): Don't care where the RA session is parented. This
>     will make it easy for the caller to use any session that is open to the
>     right repository.

> +      const char *old_session_url;
> +
> +      SVN_ERR(svn_client__ensure_ra_session_url(
> +                &old_session_url, ra_session, target_loc->url, scratch_pool));

>        [...] find_unsynced_ranges(const svn_client__p
>                        TRUE, /* discover_changed_paths */
>                        log_find_operative_revs, &log_baton,
>                        scratch_pool));
> +
> +      SVN_ERR(svn_ra_reparent(ra_session, old_session_url, scratch_pool));
>      }