Posted to dev@subversion.apache.org by Ed Hillmann <ed...@yahoo.com> on 2006/08/30 22:29:27 UTC

Status of new working copy format

I've been contributing to the Subversion module for NetBeans.  Specifically, I've added a working copy parser, so the module can access status and info details for a file by directly parsing the working copy.

I've noticed that SVN 1.4 has changed the format of the working copy.  So I've updated the parser to handle both formats (using the 1.4 rc 4 release).
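
For illustration only, here is a minimal Python sketch of how a parser might tell the two layouts apart before choosing a code path. It assumes (this is not stated in the message) that the older .svn/entries file is XML and that the 1.4-style file begins with a line holding only the format number. The function name and the returned labels are hypothetical.

```python
from pathlib import Path


def detect_entries_layout(entries_path):
    """Guess which layout a .svn/entries file uses.

    Assumption (not confirmed by this thread): pre-1.4 files are XML,
    while 1.4-style files start with a line holding only the format number.
    """
    text = Path(entries_path).read_text(encoding="utf-8", errors="replace")
    first_line = text.lstrip().split("\n", 1)[0].strip()
    if first_line.startswith("<"):
        # Covers both an XML declaration and a bare root element.
        return ("xml", None)
    if first_line.isdigit():
        return ("plain", int(first_line))
    raise ValueError(f"unrecognized entries layout in {entries_path}")
```

A caller could dispatch on the first element of the result and refuse to parse when the format number is one it has not been tested against.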

I wanted to ask if I should expect more tweaking of the new working copy format (I just noticed that rc 5 came out).  I'd love to have this committed, so that as soon as SVN 1.4 comes out, we can cope with either working copy format.

However, if there are more changes expected, I'll make sure I go back and re-test with the latest iteration of the working copy format.

Thanks for any info,

Ed


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Status of new working copy format

Posted by Barry Scott <ba...@barrys-emacs.org>.
On Sep 7, 2006, at 01:15, Daniel Berlin wrote:

> Yeah, proplist sucks.
> It keeps reopening and rereading the entries file again and again.

Is that fixable within the current API design, or does it need an API
that returns all the props for all the files in one dir?

This is an operation that I use in the WorkBench GUI as often
as I call svn_client_status so that I can update the
view of the current directory. As such it has to be fast to prevent
the user thinking that the GUI is unresponsive. I expect other
Subversion GUIs to benefit from improving prop access.

Barry




Re: Status of new working copy format

Posted by Daniel Berlin <db...@dberlin.org>.
Yeah, proplist sucks.
It keeps reopening and rereading the entries file again and again.
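
To illustrate the cost pattern described above, the Python sketch below counts how many times a per-directory metadata file is read when every lookup re-reads it, versus when it is read once and reused. This is a toy model, not Subversion's code: the `name=value` file layout and all function names are made up for the example.

```python
import os
import tempfile

read_count = 0  # number of times the metadata file is opened and parsed


def load_entries(path):
    """Parse a toy 'name=value' metadata file into a dict."""
    global read_count
    read_count += 1
    with open(path, encoding="utf-8") as f:
        return dict(line.rstrip("\n").split("=", 1) for line in f if "=" in line)


def props_rereading(path, names):
    """One full read of the file per requested name (the slow pattern)."""
    return {name: load_entries(path).get(name) for name in names}


def props_read_once(path, names):
    """Read the file once and answer every lookup from memory."""
    entries = load_entries(path)
    return {name: entries.get(name) for name in names}


def demo(file_total=157):
    """Return (reads with re-reading, reads with a single load)."""
    global read_count
    names = [f"file{i}.xxx" for i in range(file_total)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "entries")
        with open(path, "w", encoding="utf-8") as f:
            f.writelines(f"{name}=prop\n" for name in names)
        read_count = 0
        slow = props_rereading(path, names)
        slow_reads = read_count
        read_count = 0
        fast = props_read_once(path, names)
        fast_reads = read_count
    assert slow == fast  # both strategies must return the same data
    return slow_reads, fast_reads
```

With the 157 files used in the test discussed in this thread, the first strategy reads the file 157 times and the second reads it once.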

On 9/6/06, Barry Scott <ba...@barrys-emacs.org> wrote:
>
> On Sep 4, 2006, at 02:38, Daniel Berlin wrote:
>
> >>
> >> 1.4 is 3 times faster than 1.3.2
> >>
> >> python code is 70 times faster than 1.3.2
> >> and still 21 times faster than 1.4.0
> >>
> >> I would guess that you cannot close the gap until you have an API
> >> that allows one call into svn client lib to get all the proplist
> >> for 1 directory.
> >
> > Sure.
> > Though I'd still love to see a ktrace of your code.
>
> I tested svn speed by using the svn command line
> thus:
>
> $ svn co URL dir
> $ time svn proplist dir/*.xxx >/dev/null
>
> I used a URL into one of my repos that has 157 files with props for
> the test.
> To minimize the effect of caching I do a checkout before running the
> proplist test.
> Repeating the svn proplist will see a speed-up because of caching. In
> a GUI the problem is often the time taken the first time you hit a
> directory.
>
> My python code is in the attached file.
>
> It can be run like this:
>
> $ python prop_speed_test.py fast dir/*.xxx
>
> Barry


Re: Status of new working copy format

Posted by Barry Scott <ba...@barrys-emacs.org>.
On Sep 4, 2006, at 02:38, Daniel Berlin wrote:

>>
>> 1.4 is 3 times faster than 1.3.2
>>
>> python code is 70 times faster than 1.3.2
>> and still 21 times faster than 1.4.0
>>
>> I would guess that you cannot close the gap until you have an API
>> that allows one call into svn client lib to get all the proplist
>> for 1 directory.
>
> Sure.
> Though I'd still love to see a ktrace of your code.

I tested svn speed by using the svn command line
thus:

$ svn co URL dir
$ time svn proplist dir/*.xxx >/dev/null

I used a URL into one of my repos that has 157 files with props for
the test.
To minimize the effect of caching I do a checkout before running the
proplist test.
Repeating the svn proplist will see a speed-up because of caching. In
a GUI the problem is often the time taken the first time you hit a
directory.
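
A rough Python equivalent of the `time` measurement above is sketched below. It is not the attached prop_speed_test.py. The helper name is hypothetical, and it simply times whatever command it is given, discarding the output as the shell example does with >/dev/null.

```python
import subprocess
import time


def time_command(argv):
    """Run argv once, discard its output, and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start
```

Because no shell is involved, the file list has to be expanded by the caller, for example `time_command(["svn", "proplist", *glob.glob("dir/*.xxx")])` after `import glob`. Running it right after a fresh checkout and then a second time would give the cold and warm figures separately.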

My python code is in the attached file.

It can be run like this:

$ python prop_speed_test.py fast dir/*.xxx

Barry

Re: Status of new working copy format

Posted by Daniel Berlin <db...@dberlin.org>.
> 
> 1.4 is 3 times faster than 1.3.2
>
> python code is 70 times faster than 1.3.2
> and still 21 times faster than 1.4.0
>
> I would guess that you cannot close the gap until you have an API that
> allows one call into svn client lib to get all the proplist for 1
> directory.

Sure.
Though I'd still love to see a ktrace of your code.


Re: Status of new working copy format

Posted by Erik Huelsmann <eh...@gmail.com>.
On 9/3/06, Barry Scott <ba...@barrys-emacs.org> wrote:
>
> On Sep 3, 2006, at 00:56, Daniel Berlin wrote:
>
> >>
> >> Once SVN has a fast API to get at properties I'd love to use it.
> >> The key here is that GUIs want info on all files in a directory.
> >> svn_client_list and svn_client_status are the only two commands
> >> that will do this. All others require one call for each file,
> >> which is very slow (because of wc locking?).
> >>
> > It used to stat and play around with a lot of files.  It also used
> > single file stats in places that it could have used readdir.
> >
> > 1.4 should be a *lot* better about this (I did a lot of strace and
> > extraneous stat/etc removal).
> >
> > Not to mention files without properties no longer have an empty file
> > associated with them.
>
> Comparing the time to get all the props of 157 files I see the following
> performance. I report two times for each tool. The first is the time
> taken
> immediately after a checkout. The second time is taken immediately
> after the first test run and will benefit from the OS cache. I'm using
> Mac OS X as the test system.
>
> 1.3.2 svn proplist - 1.613s 1.524s
> 1.4.0rc3 svn proplist - 0.504s 0.495s
> python proplist - 0.023s 0.022s
>
> 1.4 is 3 times faster than 1.3.2
>
> python code is 70 times faster than 1.3.2
> and still 21 times faster than 1.4.0
>
> I would guess that you cannot close the gap until you have an API that
> allows one call into svn client lib to get all the proplist for 1
> directory.

Can we see the code you used to test the svn API? Are you re-opening
the working copy over and over? Or do you open it once and read the
props after that?

Thanks in advance!

bye,

Erik.


Re: Status of new working copy format

Posted by Barry Scott <ba...@barrys-emacs.org>.
On Sep 3, 2006, at 00:56, Daniel Berlin wrote:

>>
>> Once SVN has a fast API to get at properties I'd love to use it.
>> The key here is that GUIs want info on all files in a directory.
>> svn_client_list and svn_client_status are the only two commands
>> that will do this. All others require one call for each file,
>> which is very slow (because of wc locking?).
>>
> It used to stat and play around with a lot of files.  It also used
> single file stats in places that it could have used readdir.
>
> 1.4 should be a *lot* better about this (I did a lot of strace and
> extraneous stat/etc removal).
>
> Not to mention files without properties no longer have an empty file
> associated with them.

Comparing the time to get all the props of 157 files I see the following
performance. I report two times for each tool. The first is the time
taken immediately after a checkout. The second time is taken immediately
after the first test run and will benefit from the OS cache. I'm using
Mac OS X as the test system.

1.3.2 svn proplist - 1.613s 1.524s
1.4.0rc3 svn proplist - 0.504s 0.495s
python proplist - 0.023s 0.022s

1.4 is 3 times faster than 1.3.2

python code is 70 times faster than 1.3.2
and still 21 times faster than 1.4.0
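
As a quick check of those ratios, the Python snippet below divides the first-run times from the measurements above. The variable names are only for illustration.

```python
# First-run times in seconds, copied from the measurements above.
svn_132 = 1.613
svn_140rc3 = 0.504
python_parser = 0.023

speedup_14_vs_13 = svn_132 / svn_140rc3        # about 3.2
speedup_py_vs_13 = svn_132 / python_parser     # about 70.1
speedup_py_vs_14 = svn_140rc3 / python_parser  # about 21.9

print(f"{speedup_14_vs_13:.1f} {speedup_py_vs_13:.1f} {speedup_py_vs_14:.1f}")
```

The results agree with the rounded figures quoted in the message.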

I would guess that you cannot close the gap until you have an API that
allows one call into svn client lib to get all the proplist for 1  
directory.

Barry



Re: Status of new working copy format

Posted by Daniel Berlin <db...@dberlin.org>.
On 9/2/06, Barry Scott <ba...@barrys-emacs.org> wrote:
>
> On Aug 31, 2006, at 02:26, Ben Collins-Sussman wrote:
>
> > On 8/30/06, Mark Phippard <ma...@softlanding.com> wrote:
> >
> >> It isn't a great idea to be doing your own parsing
> >
> > ...especially since we've already provided you an API to do it.  :-)
>
> A very very very slow API in the case of properties.
>
> In pysvn WorkBench I parse the working copy to get properties for
> all files in one directory. The python code that does this is at least
> an order of magnitude faster than the SVN API.
>
> Once SVN has a fast API to get at properties I'd love to use it.
> The key here is that GUIs want info on all files in a directory.
> svn_client_list and svn_client_status are the only two commands
> that will do this. All others require one call for each file,
> which is very slow (because of wc locking?).
>
It used to stat and play around with a lot of files.  It also used
single file stats in places that it could have used readdir.

1.4 should be a *lot* better about this (I did a lot of strace and
extraneous stat/etc removal).

Not to mention files without properties no longer have an empty file
associated with them.

--Dan


Re: Status of new working copy format

Posted by Barry Scott <ba...@barrys-emacs.org>.
On Aug 31, 2006, at 02:26, Ben Collins-Sussman wrote:

> On 8/30/06, Mark Phippard <ma...@softlanding.com> wrote:
>
>> It isn't a great idea to be doing your own parsing
>
> ...especially since we've already provided you an API to do it.  :-)

A very very very slow API in the case of properties.

In pysvn WorkBench I parse the working copy to get properties for
all files in one directory. The python code that does this is at least
an order of magnitude faster than the SVN API.

Once SVN has a fast API to get at properties I'd love to use it.
The key here is that GUIs want info on all files in a directory.
svn_client_list and svn_client_status are the only two commands
that will do this. All others require one call for each file,
which is very slow (because of wc locking?).

Barry



Re: Status of new working copy format

Posted by Ed Hillmann <ed...@yahoo.com>.
There's already a flag to disable parsing the working copy directly, so I'm sure it could be expanded to specify the use of JavaHL (where it's supported), or even to sort it out directly.  JavaSVN isn't an option (for licensing reasons).  I just haven't had much time to investigate this alternative approach.  As I've just finished up the 1.4 parsing code, perhaps I'll look at making changes so it's never used. :)



----- Original Message ----
From: Daniel Rall <dl...@collab.net>
To: Ed Hillmann <ed...@yahoo.com>
Cc: Mark Phippard <ma...@softlanding.com>; Ben Collins-Sussman <su...@red-bean.com>; SubversionDevMailingList <de...@subversion.tigris.org>
Sent: Wednesday, September 6, 2006 3:55:27 AM
Subject: Re: Status of new working copy format

On Sun, 03 Sep 2006, Ed Hillmann wrote:

<snip>

svnClientAdapter can use the Subversion command-line client, JavaHL,
or JavaSVN -- it's not stuck to wrapping the command-line client
(though Netbeans might be?).

<and more snip>

Why not use svnClientAdapter with a fall-back approach which first
looks for JavaHL, then JavaSVN, then falls back to the command-line
adapter?





Re: Status of new working copy format

Posted by Daniel Rall <dl...@collab.net>.
On Sun, 03 Sep 2006, Ed Hillmann wrote:

> No, there's not.  They are using svnClientAdapter for the core functionality.  However, they found that for each file, it was creating a new process to call out to the svn client.  So the NetBeans client was quickly running out of available processes for average-size directories in Unix environments.  Plus there was the performance hit of creating each new process.

svnClientAdapter can use the Subversion command-line client, JavaHL,
or JavaSVN -- it's not stuck to wrapping the command-line client
(though Netbeans might be?).

> Other alternatives were looked at.  While TMate's Java implementation is open source, its license wasn't going to work with NetBeans, as its platform can be used as the framework for a commercial product.  Apparently, the TMate license only allowed Open Source licensing of the Java SVN layer if the product itself is open source.
> 
> And while JavaHL was optimal, it was seen as an installation issue for Unix environments.  For Windows environments (and for the most part Linux), the required DLL/.SO should be provided with the installation files.  But that couldn't be guaranteed for all the flavors of Unix they wanted to support out of the box.

Why not use svnClientAdapter with a fall-back approach which first
looks for JavaHL, then JavaSVN, then falls back to the command-line
adapter?
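
The fall-back order suggested here could be sketched as follows, in Python for brevity. Nothing below is svnClientAdapter's real API: the backend names are labels only, and the availability probes are hypothetical stand-ins supplied by the caller.

```python
def pick_backend(candidates):
    """Return the name of the first backend whose probe reports it usable.

    candidates is an ordered list of (name, probe) pairs; each probe is a
    zero-argument callable returning True when that backend can be used.
    """
    for name, probe in candidates:
        try:
            if probe():
                return name
        except Exception:
            # A probe that fails outright is treated as "not available".
            continue
    raise RuntimeError("no Subversion backend available")


def javahl_missing():
    """Hypothetical probe standing in for a failed native-library load."""
    raise OSError("native library not found")


# Preference order from the message above; the probes are placeholders.
preferred = [
    ("javahl", javahl_missing),
    ("javasvn", lambda: False),
    ("commandline", lambda: True),
]
```

With these placeholder probes, `pick_backend(preferred)` skips the first two entries and settles on the command-line adapter.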

> So, in this case, for getting ISVNStatus and ISVNInfo data for a file, we parse the working copy itself.  This is read-only access, as mentioned earlier (all the other access is through the svn client itself via the adapter).  We just have our own local objects which implement the svnClientAdapter interfaces (ISVNStatus and ISVNInfo) with the parsed data.  This reduced the number of concurrent processes being created dramatically.
> 
> I came into this discussion late, being the contributor who asked for something to do. :)  This is how it was explained to me.  They had the justification on their project site, but I couldn't find it just now.  I was really hoping to use JavaHL (as I was keen on using the API provided by Subversion), but was told that it wasn't an option.

Re: Status of new working copy format

Posted by Ed Hillmann <ed...@yahoo.com>.
No, there's not.  They are using svnClientAdapter for the core functionality.  However, they found that for each file, it was creating a new process to call out to the svn client.  So the NetBeans client was quickly running out of available processes for average-size directories in Unix environments.  Plus there was the performance hit of creating each new process.

Other alternatives were looked at.  While TMate's Java implementation is open source, its license wasn't going to work with NetBeans, as its platform can be used as the framework for a commercial product.  Apparently, the TMate license only allowed Open Source licensing of the Java SVN layer if the product itself is open source.

And while JavaHL was optimal, it was seen as an installation issue for Unix environments.  For Windows environments (and for the most part Linux), the required DLL/.SO should be provided with the installation files.  But that couldn't be guaranteed for all the flavors of Unix they wanted to support out of the box.

So, in this case, for getting ISVNStatus and ISVNInfo data for a file, we parse the working copy itself.  This is read-only access, as mentioned earlier (all the other access is through the svn client itself via the adapter).  We just have our own local objects which implement the svnClientAdapter interfaces (ISVNStatus and ISVNInfo) with the parsed data.  This reduced the number of concurrent processes being created dramatically.

I came into this discussion late, being the contributor who asked for something to do. :)  This is how it was explained to me.  They had the justification on their project site, but I couldn't find it just now.  I was really hoping to use JavaHL (as I was keen on using the API provided by Subversion), but was told that it wasn't an option.

Thanks,
Ed

----- Original Message ----
From: Daniel Rall <dl...@collab.net>
To: Ed Hillmann <ed...@yahoo.com>
Cc: Mark Phippard <ma...@softlanding.com>; Ben Collins-Sussman <su...@red-bean.com>; SubversionDevMailingList <de...@subversion.tigris.org>
Sent: Saturday, September 2, 2006 8:28:36 AM
Subject: Re: Status of new working copy format

On Wed, 30 Aug 2006, Ben Collins-Sussman wrote:

> On 8/30/06, Mark Phippard <ma...@softlanding.com> wrote:
> 
> >It isn't a great idea to be doing your own parsing
> 
> ...especially since we've already provided you an API to do it.  :-)

Ed, if you're using JavaHL directly, or Subclipse's svnClientAdapter
code, is there any information you need which you can't get from those
Java APIs?




Re: Status of new working copy format

Posted by Daniel Rall <dl...@collab.net>.
On Wed, 30 Aug 2006, Ben Collins-Sussman wrote:

> On 8/30/06, Mark Phippard <ma...@softlanding.com> wrote:
> 
> >It isn't a great idea to be doing your own parsing
> 
> ...especially since we've already provided you an API to do it.  :-)

Ed, if you're using JavaHL directly, or Subclipse's svnClientAdapter
code, is there any information you need which you can't get from those
Java APIs?

Re: Status of new working copy format

Posted by Ben Collins-Sussman <su...@red-bean.com>.
On 8/30/06, Mark Phippard <ma...@softlanding.com> wrote:

> It isn't a great idea to be doing your own parsing

...especially since we've already provided you an API to do it.  :-)


Re: Status of new working copy format

Posted by Mark Phippard <ma...@softlanding.com>.
Ed Hillmann <ed...@yahoo.com> wrote on 08/30/2006 06:29:27 PM:

> I've been contributing to the Subversion module for NetBeans.
> Specifically, I've added a working copy parser, so the module can access
> status and info details for a file by directly parsing the working copy.
>
> I've noticed that SVN 1.4 has changed the format of the working copy. So
> I've updated the parser to handle both formats (using the 1.4 rc 4
> release).
>
> I wanted to ask if I should expect more tweaking of the new working copy
> (I just noticed that rc 5 just came out).  I'd love to have this committed,
> so that as soon as SVN 1.4 comes out, we can cope with either working copy
> format.
>
> However, if there are more changes expected, I'll make sure I go back to
> re-test with the latest iteration of the working copy format.

It isn't a great idea to be doing your own parsing, but to answer your
question, the format will stay the same for 1.4.  If it were to change
now, which is pretty much impossible, the format number would have to be
bumped.

There is a decent chance it will be bumped again in 1.5.

Mark
