You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Mark Phippard <ma...@gmail.com> on 2010/09/02 18:11:07 UTC

Re: svn commit: r992041 - in /subversion/trunk/subversion/bindings/javahl: native/CreateJ.cpp src/org/apache/subversion/javahl/CommitItem.java src/org/tigris/subversion/javahl/CommitItem.java tests/org/apache/subversion/javahl/SVNTests.java

On Thu, Sep 2, 2010 at 2:04 PM,  <hw...@apache.org> wrote:
> Author: hwright
> Date: Thu Sep  2 18:04:32 2010
> New Revision: 992041
>
> URL: http://svn.apache.org/viewvc?rev=992041&view=rev
> Log:
> JavaHL: Make a couple of URI fields of the public java.net.URI type.

I don't agree with this at all.  I hope you do not plan to run through
the API and make more of this.

In this case, there is zero benefit to doing this and I do not have
any confidence you have done the investigation needed to know if there
are any detriments.  Do you KNOW that this class will accept all
possible values SVN will give it?

That said, this class only provides data from the API to the caller.
The caller is going to prefer String.  Why make them run URI.toString?

The only place where stuff like this would make sense would be on
Input to our API.  So that users knew what they were expected to
provide.  As I said on IRC, I think it would be a mistake to do this.
String is easier.  In Subclipse we use File or URL in our API, and it
is a pain to maintain.  I wish we had just used String like JavaHL.

I think you should revert this.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: svn commit: r992041 - in /subversion/trunk/subversion/bindings/javahl: native/CreateJ.cpp src/org/apache/subversion/javahl/CommitItem.java src/org/tigris/subversion/javahl/CommitItem.java tests/org/apache/subversion/javahl/SVNTests.java

Posted by Branko Čibej <br...@xbc.nu>.
 On 07.09.2010 15:42, Hyrum K. Wright wrote:
> On Thu, Sep 2, 2010 at 2:46 PM, Mark Phippard <ma...@gmail.com> wrote:
>> On Thu, Sep 2, 2010 at 3:17 PM, Mark Phippard <ma...@gmail.com> wrote:
>>> I would not be so sure.  When you are dealing with a GUI, adding a
>>> handful of milliseconds to something that is going to be done
>>> thousands of times can add up to less responsiveness.
>> I did some micro-benchmarks and the URI class does not perform too
>> badly -- although it is clearly slower.  I did a loop to construct a
>> million objects and see how long it took:
>>
>> String class: 0.417
>> URI class:    3.968
>>
>> I then took the loop and added a call to a method that took a String
>> and just did a couple of simple checks of the String to make sure the
>> compiler was not optimizing away any problems.  That gives:
>>
>> String class: 0.455
>> URI class:    4.003
>>
>> That said, in this example, if this were someone retrieving History
>> entries that would be 4 more seconds added to the UI wait.  Granted it
>> might have been a minute or more to get these million entries from the
>> server, but it is still added time.
> While I don't agree with your reasoning, I've reverted r992041 in r993356.

I'm far from being an avid Java hacker, but I've been down the
String-vs.-File-and-URL road a couple times myself. While using the
appropriate types gives on a warm feeling about the code being more
bullet-proof, that actually isn't so true; and the downsides -- i.e.,
more API endpoints and less performance -- are not trivial.

I think the main reason for this is that File and URL are not very well
thought out, and despite all the improvements in newer versions of the
VM, youre still doubling the number of objects you create for every
string that you use. I blame heap fragmentation.

-- Brane

Re: svn commit: r992041 - in /subversion/trunk/subversion/bindings/javahl: native/CreateJ.cpp src/org/apache/subversion/javahl/CommitItem.java src/org/tigris/subversion/javahl/CommitItem.java tests/org/apache/subversion/javahl/SVNTests.java

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Thu, Sep 2, 2010 at 2:46 PM, Mark Phippard <ma...@gmail.com> wrote:
> On Thu, Sep 2, 2010 at 3:17 PM, Mark Phippard <ma...@gmail.com> wrote:
>> I would not be so sure.  When you are dealing with a GUI, adding a
>> handful of milliseconds to something that is going to be done
>> thousands of times can add up to less responsiveness.
>
> I did some micro-benchmarks and the URI class does not perform too
> badly -- although it is clearly slower.  I did a loop to construct a
> million objects and see how long it took:
>
> String class: 0.417
> URI class:    3.968
>
> I then took the loop and added a call to a method that took a String
> and just did a couple of simple checks of the String to make sure the
> compiler was not optimizing away any problems.  That gives:
>
> String class: 0.455
> URI class:    4.003
>
> That said, in this example, if this were someone retrieving History
> entries that would be 4 more seconds added to the UI wait.  Granted it
> might have been a minute or more to get these million entries from the
> server, but it is still added time.

While I don't agree with your reasoning, I've reverted r992041 in r993356.

-Hyrum

Re: svn commit: r992041 - in /subversion/trunk/subversion/bindings/javahl: native/CreateJ.cpp src/org/apache/subversion/javahl/CommitItem.java src/org/tigris/subversion/javahl/CommitItem.java tests/org/apache/subversion/javahl/SVNTests.java

Posted by Mark Phippard <ma...@gmail.com>.
On Thu, Sep 2, 2010 at 3:17 PM, Mark Phippard <ma...@gmail.com> wrote:
> I would not be so sure.  When you are dealing with a GUI, adding a
> handful of milliseconds to something that is going to be done
> thousands of times can add up to less responsiveness.

I did some micro-benchmarks and the URI class does not perform too
badly -- although it is clearly slower.  I did a loop to construct a
million objects and see how long it took:

String class: 0.417
URI class:    3.968

I then took the loop and added a call to a method that took a String
and just did a couple of simple checks of the String to make sure the
compiler was not optimizing away any problems.  That gives:

String class: 0.455
URI class:    4.003

That said, in this example, if this were someone retrieving History
entries that would be 4 more seconds added to the UI wait.  Granted it
might have been a minute or more to get these million entries from the
server, but it is still added time.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: svn commit: r992041 - in /subversion/trunk/subversion/bindings/javahl: native/CreateJ.cpp src/org/apache/subversion/javahl/CommitItem.java src/org/tigris/subversion/javahl/CommitItem.java tests/org/apache/subversion/javahl/SVNTests.java

Posted by Mark Phippard <ma...@gmail.com>.
On Thu, Sep 2, 2010 at 3:05 PM, Hyrum K. Wright
<hy...@mail.utexas.edu> wrote:

>>> After our conversation on IRC, I did write a short test program to
>>> better investigate what the Java URI class accepted or balked at.  I
>>> discovered (empirically, so definitely not completely comprehensive)
>>> that the URI class handles the various schemes we support just fine.
>>> The documentation also supports this: "a URI instance is little more
>>> than a structured string that supports the syntactic,
>>> scheme-independent operations of comparison, normalization,
>>> resolution, and relativization."
>>
>> What about performance implications?  Depending how far you take this.
>>  If I were retrieving log messages, with thousands of paths, I would
>> not want to introduce any extra clock cycles just so that SVN can
>> convert its string into some object it thinks I might want and so that
>> I can convert it back.  Reading the JavaDoc for URI, it seems like it
>> must do some parsing of the string so I wonder how lightweight it is.
>
> If you are retrieving log messages with thousands of paths, the
> network latencies and other Subversion operations will vastly dominate
> transaction times.  Whatever overhead you've got in doing whatever
> parsing the URI class does is minuscule in comparison.  (Also, there
> isn't any URI-specific fields returned through log, so that example
> isn't particularly valid.  Most of the places we exclusively provide
> URIs are not invoked inside loops.)

I would not be so sure.  When you are dealing with a GUI, adding a
handful of milliseconds to something that is going to be done
thousands of times can add up to less responsiveness.

Also, I specifically acknowledged this API has not been changed.  I am
asking you your intent.  The API you changed so far is meaningless to
me, but I am assuming you have every intent of changing every API we
have as that has been the previous pattern and I am registering my
objections now before you invest the effort.

>>>> I think you should revert this.
>>>
>>> I understand your feelings.  Subclipse is a very large consumer of the
>>> JavaHL API, and the bindings have largely evolved to meet the needs of
>>> this (very large) consumer.  However, I'd like to hear what other
>>> opinions folks may have before I make any other changes (both
>>> reverting this change, as well as extending others in a similar vein).
>>
>> I do not agree with you there is any user value here.  I am interested
>> to know the full scope you intend to take this.  I think people are
>> just going to need to take the URL convert it to a String and then
>> maybe convert the String into what they wanted (though I suspect most
>> will want String).
>>
>> I do not see why we would return URI if we do not accept URI into the API.
>
> We return *much* more information that we accept into the API.

Of course.  But very few of these items you refer to are ever fed back
into the API.  Again, I ask the intent of how far you plan to go with
this?  Why would you change one API if you do not plan changing all of
them?  If all the API's return URI's then soon you will want to change
the input as was discussed yesterday.  Or at a minimum it is going to
bother you that we return URI but ask for String.


> I'm not opposed to making similar
> changes to the input APIs (such as with the upgrade API).  As an API
> consumer, I'd much rather find out about a type-based error at
> compile-time than run-time, especially when my language of choice
> provides that capability.

I understand and on a micro level this makes sense.  When we took this
approach in Subclipse it certainly looked like an improvement on the
API to use classes like File and URL in place of String.  When you
look at one single Interface like ISVNClient that already has a ton of
methods, and then you start taking methods like copy and adding a
variant for every combination of File and URL it can accept it gets
ugly quick.  Factor in future releases where we rev the API and it
gets uglier.  In my usage of the API I also get little benefit as I
tend to have to convert what I am working with to a File or URL before
calling the API so it would be easier if it had just been String.

> Again, in an effort to understand the other implications of these
> types of changes, I'm interested to hear what thoughts others have on
> this subject.

Same here.  I'll just repeat that I do not think it is going to help
someone using our API to know they are going to receive URI in place
of String.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: svn commit: r992041 - in /subversion/trunk/subversion/bindings/javahl: native/CreateJ.cpp src/org/apache/subversion/javahl/CommitItem.java src/org/tigris/subversion/javahl/CommitItem.java tests/org/apache/subversion/javahl/SVNTests.java

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Thu, Sep 2, 2010 at 1:42 PM, Mark Phippard <ma...@gmail.com> wrote:
> On Thu, Sep 2, 2010 at 2:33 PM, Hyrum K. Wright
> <hy...@mail.utexas.edu> wrote:
>
>> After our conversation on IRC, I did write a short test program to
>> better investigate what the Java URI class accepted or balked at.  I
>> discovered (empirically, so definitely not completely comprehensive)
>> that the URI class handles the various schemes we support just fine.
>> The documentation also supports this: "a URI instance is little more
>> than a structured string that supports the syntactic,
>> scheme-independent operations of comparison, normalization,
>> resolution, and relativization."
>
> What about performance implications?  Depending how far you take this.
>  If I were retrieving log messages, with thousands of paths, I would
> not want to introduce any extra clock cycles just so that SVN can
> convert its string into some object it thinks I might want and so that
> I can convert it back.  Reading the JavaDoc for URI, it seems like it
> must do some parsing of the string so I wonder how lightweight it is.

If you are retrieving log messages with thousands of paths, the
network latencies and other Subversion operations will vastly dominate
transaction times.  Whatever overhead you've got in doing whatever
parsing the URI class does is minuscule in comparison.  (Also, there
isn't any URI-specific fields returned through log, so that example
isn't particularly valid.  Most of the places we exclusively provide
URIs are not invoked inside loops.)

>>> I think you should revert this.
>>
>> I understand your feelings.  Subclipse is a very large consumer of the
>> JavaHL API, and the bindings have largely evolved to meet the needs of
>> this (very large) consumer.  However, I'd like to hear what other
>> opinions folks may have before I make any other changes (both
>> reverting this change, as well as extending others in a similar vein).
>
> I do not agree with you there is any user value here.  I am interested
> to know the full scope you intend to take this.  I think people are
> just going to need to take the URL convert it to a String and then
> maybe convert the String into what they wanted (though I suspect most
> will want String).
>
> I do not see why we would return URI if we do not accept URI into the API.

We return *much* more information that we accept into the API.  When
returning information, we also *know* what type of information we
have, rather than having go guess based upon the open-ended input
APIs.  In cases where we *know* that we accept exclusively working
copy paths or repository URIs, I'm not opposed to making similar
changes to the input APIs (such as with the upgrade API).  As an API
consumer, I'd much rather find out about a type-based error at
compile-time than run-time, especially when my language of choice
provides that capability.

Again, in an effort to understand the other implications of these
types of changes, I'm interested to hear what thoughts others have on
this subject.

-Hyrum

Re: svn commit: r992041 - in /subversion/trunk/subversion/bindings/javahl: native/CreateJ.cpp src/org/apache/subversion/javahl/CommitItem.java src/org/tigris/subversion/javahl/CommitItem.java tests/org/apache/subversion/javahl/SVNTests.java

Posted by Mark Phippard <ma...@gmail.com>.
On Thu, Sep 2, 2010 at 2:33 PM, Hyrum K. Wright
<hy...@mail.utexas.edu> wrote:

> After our conversation on IRC, I did write a short test program to
> better investigate what the Java URI class accepted or balked at.  I
> discovered (empirically, so definitely not completely comprehensive)
> that the URI class handles the various schemes we support just fine.
> The documentation also supports this: "a URI instance is little more
> than a structured string that supports the syntactic,
> scheme-independent operations of comparison, normalization,
> resolution, and relativization."

What about performance implications?  Depending how far you take this.
 If I were retrieving log messages, with thousands of paths, I would
not want to introduce any extra clock cycles just so that SVN can
convert its string into some object it thinks I might want and so that
I can convert it back.  Reading the JavaDoc for URI, it seems like it
must do some parsing of the string so I wonder how lightweight it is.

>> I think you should revert this.
>
> I understand your feelings.  Subclipse is a very large consumer of the
> JavaHL API, and the bindings have largely evolved to meet the needs of
> this (very large) consumer.  However, I'd like to hear what other
> opinions folks may have before I make any other changes (both
> reverting this change, as well as extending others in a similar vein).

I do not agree with you there is any user value here.  I am interested
to know the full scope you intend to take this.  I think people are
just going to need to take the URL convert it to a String and then
maybe convert the String into what they wanted (though I suspect most
will want String).

I do not see why we would return URI if we do not accept URI into the API.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: svn commit: r992041 - in /subversion/trunk/subversion/bindings/javahl: native/CreateJ.cpp src/org/apache/subversion/javahl/CommitItem.java src/org/tigris/subversion/javahl/CommitItem.java tests/org/apache/subversion/javahl/SVNTests.java

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Thu, Sep 2, 2010 at 1:11 PM, Mark Phippard <ma...@gmail.com> wrote:
> On Thu, Sep 2, 2010 at 2:04 PM,  <hw...@apache.org> wrote:
>> Author: hwright
>> Date: Thu Sep  2 18:04:32 2010
>> New Revision: 992041
>>
>> URL: http://svn.apache.org/viewvc?rev=992041&view=rev
>> Log:
>> JavaHL: Make a couple of URI fields of the public java.net.URI type.
>
> I don't agree with this at all.  I hope you do not plan to run through
> the API and make more of this.
>
> In this case, there is zero benefit to doing this and I do not have
> any confidence you have done the investigation needed to know if there
> are any detriments.  Do you KNOW that this class will accept all
> possible values SVN will give it?

After our conversation on IRC, I did write a short test program to
better investigate what the Java URI class accepted or balked at.  I
discovered (empirically, so definitely not completely comprehensive)
that the URI class handles the various schemes we support just fine.
The documentation also supports this: "a URI instance is little more
than a structured string that supports the syntactic,
scheme-independent operations of comparison, normalization,
resolution, and relativization."

> That said, this class only provides data from the API to the caller.
> The caller is going to prefer String.  Why make them run URI.toString?

I understand that Subclipse might prefer String, but that does not
mean that all callers do.  In general, I think it wisest to use the
type system, as well as the documentation, to communicate with API
users about what types of data they can expect from the API.  We are
working with a language which innately supports a URI datatype (unlike
the core libraries), and I struggle to see why adding more type
information is a bad thing.

> The only place where stuff like this would make sense would be on
> Input to our API.  So that users knew what they were expected to
> provide.  As I said on IRC, I think it would be a mistake to do this.
> String is easier.  In Subclipse we use File or URL in our API, and it
> is a pain to maintain.  I wish we had just used String like JavaHL.

Per our IRC conversation, I agree that it is unwise to convert Strings
which may represent either a local path of a remote location.
However, in cases where anything other than a well-formed URI (or
null) is valid, it makes sense to use a more specific type.  I've no
plans to change cases where a String could represent either a local
path or a remote URL.

> I think you should revert this.

I understand your feelings.  Subclipse is a very large consumer of the
JavaHL API, and the bindings have largely evolved to meet the needs of
this (very large) consumer.  However, I'd like to hear what other
opinions folks may have before I make any other changes (both
reverting this change, as well as extending others in a similar vein).

Best,
-Hyrum