You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Bert Huijben <rh...@sharpsvn.net> on 2009/02/07 13:36:12 UTC

[Client developers HEADS-UP] Penalty of APR_FILEPATH_TRUENAME (Was RE: svn commit: r35733 - trunk/subversion/bindings/javahl/native)

> Log:
> * subversion/bindings/javahl/native/JNIUtil.cpp
>   (JNIUtil::preprocessPath): Remove a gigantic Windows only performance
>     penalty by using the always correct svn_path_internal_style()
> instead
>     of the hidden breaking apr_filepath_merge with
> APR_FILEPATH_TRUENAME
>     that only did validation on Windows.

	Hi,

Until recently some parts of Subversion used apr_filepath_merge with the
APR_FILEPATH_TRUENAME flag all over the place. (It currently only uses it
for paths passed to svn via the commandline where it is actually usefull for
catching user typos).

Using this method is VERY VERY expensive on Windows. Our (and your) code
should canonicalize paths only once. Exactly when passed from an untrusted
source and after that assume that paths are canonical (or at least correctly
cased). 

(Apr ignores this flag on all OSs except Windows, so as Linux developer you
would never know.)


I would recommend that other subversion client developers review their code
to make sure they don't accidentally call this same method without realizing
that this is expensive and you can get a fILE back when you gave the method
a File. 


In 99.9% of all cases this normalization does nothing, except for costing a
lot of time; in that other 0.1 percent (where the file on disk has another
casing than subversion thinks it has), it makes it impossible for you to
call subversion commands on the administratively managed file, as it would
automatically 'repair' your 'File' to 'fILE'.

Thanks for your attention,

	Bert

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1119074

RE: [Client developers HEADS-UP] Penalty of APR_FILEPATH_TRUENAME (Was RE: svn commit: r35733 - trunk/subversion/bindings/javahl/native)

Posted by Bert Huijben <rh...@sharpsvn.net>.
> -----Original Message-----
> From: Mark Phippard [mailto:markphip@gmail.com]
> Sent: Saturday, February 07, 2009 5:31 PM
> To: Bert Huijben
> Cc: dev@subversion.tigris.org
> Subject: Re: [Client developers HEADS-UP] Penalty of
> APR_FILEPATH_TRUENAME (Was RE: svn commit: r35733 -
> trunk/subversion/bindings/javahl/native)
> 
> On Sat, Feb 7, 2009 at 8:36 AM, Bert Huijben <rh...@sharpsvn.net>
> wrote:
> >> Log:
> >> * subversion/bindings/javahl/native/JNIUtil.cpp
> >>   (JNIUtil::preprocessPath): Remove a gigantic Windows only
> performance
> >>     penalty by using the always correct svn_path_internal_style()
> >> instead
> >>     of the hidden breaking apr_filepath_merge with
> >> APR_FILEPATH_TRUENAME
> >>     that only did validation on Windows.
> >
> >        Hi,
> >
> > Until recently some parts of Subversion used apr_filepath_merge with
> the
> > APR_FILEPATH_TRUENAME flag all over the place. (It currently only
> uses it
> > for paths passed to svn via the commandline where it is actually
> usefull for
> > catching user typos).
> >
> > Using this method is VERY VERY expensive on Windows. Our (and your)
> code
> > should canonicalize paths only once. Exactly when passed from an
> untrusted
> > source and after that assume that paths are canonical (or at least
> correctly
> > cased).
> >
> > (Apr ignores this flag on all OSs except Windows, so as Linux
> developer you
> > would never know.)
> >
> >
> > I would recommend that other subversion client developers review
> their code
> > to make sure they don't accidentally call this same method without
> realizing
> > that this is expensive and you can get a fILE back when you gave the
> method
> > a File.
> >
> >
> > In 99.9% of all cases this normalization does nothing, except for
> costing a
> > lot of time; in that other 0.1 percent (where the file on disk has
> another
> > casing than subversion thinks it has), it makes it impossible for you
> to
> > call subversion commands on the administratively managed file, as it
> would
> > automatically 'repair' your 'File' to 'fILE'.
> >
> > Thanks for your attention,
> 
> I am in favor of the change.  I'll just have to keep my eyes open for
> whether this causes any problems for us in Subclipse on Windows.

If subclipse currently works fine on OS/X (I assume it does), it will
function the same on Windows.
(It just removes the platform difference)

> Did you do any kind of searching to get an idea where this function is
> used?  For example, it is not something that is exposed to us on the

It is used about everywhere where a path is passed from the Java side to the
C/C++ side, to normalize the paths at the C++ side. (Some newer code ignores
this function.. That code should probably be fixed to make sure the same
code path is followed)

> Java side, it is used internally in the C++ code.  I suspect it is
> only used when methods are called with paths in them, so performance
> improvement might not be that noticeable.  If, for example, we are

Passing a path can go from costing about a 1/10th of a second to about one
microsecond. This is something you'll notice :)

(Just not on that update code path we were talking about on the users list
:( )

r35731 and r35732 should have more impact there. (Should be about 5% on
Windows; probably more on normal PCs). But fixing generic issues don't help
bridge the difference between Linux and Windows :)
Not that I mind.. Going fast enough on all OSs would be just fine :)

> doing a merge or update the overwhelming majority of the code is
> happening inside the SVN libraries, there is not a lot of back and
> forth from Java while these run.  So the kind of performance speed-ups
> you did before, where it is being used within the libraries, is likely
> to have a much bigger impact than this.  But like I said, every bit
> helps and there might still be cases where we are doing things like
> checking Status on a large working copy where this will make a nice
> difference.

Don't underestimate the performance impact of the notify handler! (Comparing
AnkhSVN/SharpSvn to TortoiseSVN, this was in most cases the explanation why
AnkhSVN 2.0 was faster (and AnkhSVN 1.X was slower) than TortoiseSVN). If
the gui repaints in the notify handler, the TCP/IP layer slows down all
network IO because the client can't handle the data fast enough.
(The slow scroll effects in XP/Vista can be a killer here, unless disabled
by the code)

Moving the long-running subversion actions to a background thread and doing
the UI updates asynchronous can really make a difference.

Responding to the thread on users@:
I'm looking whether implementing the svn_stream_t Windows specifically might
make a difference. (apr_file_open allocates more than a kilobyte of data for
asynchronous IO, thread synchronization, etc., never used by Subversion)

	Bert
> 
> Thanks
> 
> --
> Thanks
> 
> Mark Phippard
> http://markphip.blogspot.com/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1123522

Re: [Client developers HEADS-UP] Penalty of APR_FILEPATH_TRUENAME (Was RE: svn commit: r35733 - trunk/subversion/bindings/javahl/native)

Posted by Mark Phippard <ma...@gmail.com>.
On Sat, Feb 7, 2009 at 8:36 AM, Bert Huijben <rh...@sharpsvn.net> wrote:
>> Log:
>> * subversion/bindings/javahl/native/JNIUtil.cpp
>>   (JNIUtil::preprocessPath): Remove a gigantic Windows only performance
>>     penalty by using the always correct svn_path_internal_style()
>> instead
>>     of the hidden breaking apr_filepath_merge with
>> APR_FILEPATH_TRUENAME
>>     that only did validation on Windows.
>
>        Hi,
>
> Until recently some parts of Subversion used apr_filepath_merge with the
> APR_FILEPATH_TRUENAME flag all over the place. (It currently only uses it
> for paths passed to svn via the commandline where it is actually usefull for
> catching user typos).
>
> Using this method is VERY VERY expensive on Windows. Our (and your) code
> should canonicalize paths only once. Exactly when passed from an untrusted
> source and after that assume that paths are canonical (or at least correctly
> cased).
>
> (Apr ignores this flag on all OSs except Windows, so as Linux developer you
> would never know.)
>
>
> I would recommend that other subversion client developers review their code
> to make sure they don't accidentally call this same method without realizing
> that this is expensive and you can get a fILE back when you gave the method
> a File.
>
>
> In 99.9% of all cases this normalization does nothing, except for costing a
> lot of time; in that other 0.1 percent (where the file on disk has another
> casing than subversion thinks it has), it makes it impossible for you to
> call subversion commands on the administratively managed file, as it would
> automatically 'repair' your 'File' to 'fILE'.
>
> Thanks for your attention,

I am in favor of the change.  I'll just have to keep my eyes open for
whether this causes any problems for us in Subclipse on Windows.

Did you do any kind of searching to get an idea where this function is
used?  For example, it is not something that is exposed to us on the
Java side, it is used internally in the C++ code.  I suspect it is
only used when methods are called with paths in them, so performance
improvement might not be that noticeable.  If, for example, we are
doing a merge or update the overwhelming majority of the code is
happening inside the SVN libraries, there is not a lot of back and
forth from Java while these run.  So the kind of performance speed-ups
you did before, where it is being used within the libraries, is likely
to have a much bigger impact than this.  But like I said, every bit
helps and there might still be cases where we are doing things like
checking Status on a large working copy where this will make a nice
difference.

Thanks

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1119968