You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@harmony.apache.org by Mark Hindess <ma...@googlemail.com> on 2008/04/22 19:02:19 UTC

Re: [classlib][luni][performance] IdentityHashMap implementation

On 22 April 2008 at 15:23, "Aleksey Shipilev" <al...@gmail.com> 
wrote:
> Tim,
> 
> I support your proposal and try to make distinction between
> proof-of-concept and clean patches.
> 
> Performance improvements could be notorious work, requiring a lot of
> resources. The bad thing is, they could give nothing after the work is
> done, that's why we are doing thought experiments and proof-of-concept
> patches before actually implementing something. Proof-of-concept may
> also give a hint on what maximum improvement we can get and set the
> expectations for clean patch.
> 
> So we are not looking for hiding the problems, rather we're exploring
> what could it worth.
> 
> F.ex. the experiments on IdentityHashMap performance finally led to
> clean, non-breaking, consistent implementation, ready to be committed
> patch [1]. Can we focus on reviewing it?

"clean [snip] patch"?  Okay, I'll take the bait...

1) It is good to fix the assert in the test case, but if you are going to
change it at all you might as well replace:

  assertTrue("TestB. removed entries incorrectly", map.size() == 11 && 
!result);

with:

  assertEquals("TestB. removed entries incorrectly, size is not correct",
               11, map.size());
  assertFalse("TestB. removed entries incorrectly, result is not correct",
              result);


2) I think the IdentityHashMap javadoc comment change removes too much
important information.


3) I don't understand why you make arbitrary whitespace changes - such
as rewriting:

public class IdentityHashMap<K, V> extends AbstractMap<K, V> implements
        Map<K, V>, Serializable, Cloneable {

to be longer than 80-characters.


4) You rename DEFAULT_MAX_SIZE to DEFAULT_SIZE but I don't really see
the relevance of this to the changes.


5) You seem to remove a number of correct comments from the
declarations of threshold, DEFAULT_MAX_SIZE, loadFactor, and modCount.

I almost I gave up reading the patch at this point out of frustration
at the number of irrelevant changes.  Such as:

a) the re-ordering of often unchanged public methods - previously the
public methods were grouped by similar functionality which seems logical
to me but now they are in alphabetical order.  This makes the patch much
bigger and harder to read.  If there is a good reason to re-order the
methods, do it in a separate patch.


b) The arbitrary changes to javadoc like:

     /**
-     * Searches this Map for the specified value.
-     * 
-     * 
+     * Searches this IdentityHashMap for the specified value.
+     *
      * @param value
      *            the object to search for
-     * @return true if <code>value</code> is a value of this Map, false
+     * @return true if <code>value</code> is a value of this IdentityHashMap, 
false
      *         otherwise
      */

c) The arbitrary renaming of IdentityHashMapEntry to Entry.

Is it unrealistic to expect clean patches with justifications for
changes?  Can we optimise patches as well as code?

If I apply the patch and then revert the whitespace changes,
javadoc reformatting, renaming of DEFAULT_MAX_SIZE, renaming of
IdentityHashMapEntry, etc. the the resulting svn diff contains at least
33% fewer +/- lines than the one on the JIRA and is still functionally
the same.

As someone who tries to read svn commit messages (and all committers or
would-be-committers should be doing this to some extent!) patches like
this are a nightmare.

Regards,
-Mark.

> 
> Thanks,
> Aleksey.
> 
> [1] https://issues.apache.org/jira/browse/HARMONY-5771
> 
> On Tue, Apr 22, 2008 at 12:09 PM, Tim Ellison <t....@gmail.com> wrote:
> > Aleksey Shipilev wrote:
> >
> > > Right, Tim. It was just the proof-of-concept "Does IdentityHashMap
> > > tuning worth it?".
> > >
> >
> >  I think it is unhelpful to mix actual, concrete suggestions for performanc
> e
> > improvements, with proof-of-concept ideas that break the specification or
> > security.
> >
> >  Please can people mark clearly in their mail whether they are suggesting
> > the idea only as "a thought experiment", otherwise somebody may apply the
> > change without properly understanding the consequences.
> >
> >  I realize the onus is on the committer to ensure it is safe but hiding suc
> h
> > problems is, as I said, unhelpful.
> >
> >  Thanks,
> >  Tim
> >
> >
> >
> >
> >
> > > On Mon, Apr 21, 2008 at 4:50 PM, Tim Ellison <t....@gmail.com>
> > wrote:
> > >
> > > > Sergey Salishev wrote:
> > > >
> > > >
> > > > > It also should be noticed that replacing the IdentityHashMap with
> > HashMap
> > > > >
> > > > in
> > > >
> > > > > Thread impl gives 4.5x boost on ThreadLocalBench.
> > > > >
> > > > >
> > > >  But it would be wrong since it opens a security vulnerability.
> > > >
> > > >  See:
> > > >  http://markmail.org/message/rwq3kif5n33wp65m
> > > >  http://svn.apache.org/viewvc?view=rev&revision=477355
> > > >
> > > >
> > > >  Regards,
> > > >  Tim
> > > >
> > > >
> > >
> > >
> >
> 



Re: [classlib][luni][performance] IdentityHashMap implementation

Posted by Alexey Petrenko <al...@gmail.com>.
Aleksey,

I would support Mark here. Clear patch is very important because it
makes our lives much easier and we are not required to skip 70K of
irrelevant text to see 2 relevant lines :)
So I think that your idea is good but should be better delivered :)

Take a look at "Good Issue Resolution Guideline" for example.
http://harmony.apache.org/issue_resolution_guideline.html

SY, Alexey

2008/4/23, Mark Hindess <ma...@googlemail.com>:
>
> On 23 April 2008 at 0:36, "Aleksey Shipilev" <al...@gmail.com>
> wrote:
> > Hi Endre,
> >
> > On Tue, Apr 22, 2008 at 11:30 PM, Endre St=F8lsvik <En...@stolsvik.com> wro=
> > te:
> > > Aleksey Shipilev wrote:
> > > > The reason behind all that changes is that entire IdentityHashMap
> > > > implementation was thrown away and replaced by HashMap
> > >
> > >  Isn't it possible to actually record this fact using SVN, by
> > > deleting the file, and then adding it again (or svn copy it from
> > > HashMap) - so that it doesn't look like a *change*, but more what it
> > > actually is: a remove, and then an add (actually, a copy)?
> >
> > Unfortunately, that's not usable, you might play around to see why. If
> > you find a solution, please let me know :)
>
> Fortunately, it is not compulsory to create patches with "svn diff".
> I've just done:
>
> 1) apply your patch to a fresh checkout
> 2) cp modules/luni/src/main/java/java/util/HashMap.java \
>      modules/luni/src/main/java/java/util/IdentityHashMap.java.orig
> 3) diff -u modules/luni/src/main/java/java/util/IdentityHashMap.java.orig \
>           modules/luni/src/main/java/java/util/IdentityHashMap.java
>
> The resulting patch would be much more suitable for attachment to a JIRA.
>
> I'd still fix a few things about HashMap.java first though.
> -Mark.
>
>
>

Re: [classlib][luni][performance] IdentityHashMap implementation

Posted by Mark Hindess <ma...@googlemail.com>.
On 23 April 2008 at 0:36, "Aleksey Shipilev" <al...@gmail.com> 
wrote:
> Hi Endre,
> 
> On Tue, Apr 22, 2008 at 11:30 PM, Endre St=F8lsvik <En...@stolsvik.com> wro=
> te:
> > Aleksey Shipilev wrote:
> > > The reason behind all that changes is that entire IdentityHashMap
> > > implementation was thrown away and replaced by HashMap
> >
> >  Isn't it possible to actually record this fact using SVN, by
> > deleting the file, and then adding it again (or svn copy it from
> > HashMap) - so that it doesn't look like a *change*, but more what it
> > actually is: a remove, and then an add (actually, a copy)?
>
> Unfortunately, that's not usable, you might play around to see why. If
> you find a solution, please let me know :)

Fortunately, it is not compulsory to create patches with "svn diff".
I've just done:

1) apply your patch to a fresh checkout
2) cp modules/luni/src/main/java/java/util/HashMap.java \
      modules/luni/src/main/java/java/util/IdentityHashMap.java.orig
3) diff -u modules/luni/src/main/java/java/util/IdentityHashMap.java.orig \
           modules/luni/src/main/java/java/util/IdentityHashMap.java

The resulting patch would be much more suitable for attachment to a JIRA.

I'd still fix a few things about HashMap.java first though.
-Mark.



Re: [classlib][luni][performance] IdentityHashMap implementation

Posted by Aleksey Shipilev <al...@gmail.com>.
Hi Endre,

On Tue, Apr 22, 2008 at 11:30 PM, Endre Stølsvik <En...@stolsvik.com> wrote:
> Aleksey Shipilev wrote:
> > The reason behind all that changes is that entire IdentityHashMap
> > implementation was thrown away and replaced by HashMap
>  Isn't it possible to actually record this fact using SVN, by deleting the
> file, and then adding it again (or svn copy it from HashMap) - so that it
> doesn't look like a *change*, but more what it actually is: a remove, and
> then an add (actually, a copy)?
Unfortunately, that's not usable, you might play around to see why. If
you find a solution, please let me know :)

The reason is, svn diff is preparing the patch by diffing
IdentityHashMap.java and IdentityHashMap.java.svn-base, so it does not
matter how would you replace the contents of IdentityHashMap.java -
the results will be the same patch. Moreover, if you'll do "svn rm /
svn copy" the history for HashMap.java will be forwarded to
IdentityHashMap.java and "svn diff" will not generate anything.
Despite the fact this change might be more clean while committing (and
probably should be exploited by committer), it's hard for
non-committer to share the patch in JIRA. I don't know though whether
such change will discard the IdentityHashMap modification history.


Thanks,
Aleksey.

Re: [classlib][luni][performance] IdentityHashMap implementation

Posted by Endre Stølsvik <En...@stolsvik.com>.
Aleksey Shipilev wrote:
> Good comments, Mark, thanks! But I have to chill your fury a little :)
> 
> The reason behind all that changes is that entire IdentityHashMap
> implementation was thrown away and replaced by HashMap

Isn't it possible to actually record this fact using SVN, by deleting 
the file, and then adding it again (or svn copy it from HashMap) - so 
that it doesn't look like a *change*, but more what it actually is: a 
remove, and then an add (actually, a copy)?

Endre.

Re: [classlib][luni][performance] IdentityHashMap implementation

Posted by Nathan Beyer <nd...@apache.org>.
On Tue, Apr 22, 2008 at 4:03 PM, Mark Hindess <ma...@googlemail.com>
wrote:

>
> On 22 April 2008 at 23:02, "Aleksey Shipilev" <al...@gmail.com>
> >
> wrote:
> >
> > Good comments, Mark, thanks! But I have to chill your fury a little :)
>
> I'm might be a little bothered but I'm certainly not furious.  I just
> wish to point out issues so that we can make things better.  I'd not
> want to see the patch applied in its current form but I've already
> moved on from the "clean" issue - by proving to myself that it could be
> cleaned up - and I'm looking at the "non-breaking" part now.
>
> > The reason behind all that changes is that entire IdentityHashMap
> > implementation was thrown away and replaced by HashMap, then the
> > relevant changes that transform it back to IdentityHashMap were done
> > (changing hashCode() to System.identityHashCode(), equals to ==, etc).
>
> sigh... this should really have been made clear in the JIRA - see below
> for an example of how this could be done.
>
> > So if you diff the HashMap.java and IdentityHashMap.java after the
> > patch applied, you should notice very small difference. That's a
> > good sign because thus we can easily identify and generalize the
> > common code in both implementations, as we suggested to do. That's the
> > generalization we talked about with Nathan and Tim, right?
>
> Well, I've not looked at HashMap.java but I suspect that it might have
> methods ordered alphabetically too which is horrible and should be
> fixed.
>
> Also, when I reverted the spurious javadoc changes back to the
> original form, I replaced a number of instances of the specific word
> IdentityHashMap with the less specific word Map.  Given the context
> doesn't detract from the meaning and does make it easier to read.
> Rather than make the IdentityHashMap javadoc more like the HashMap
> javadoc I suggest we change the HashMap javadoc in this way too.  (This
> is a personal preference but it would make the significant differences
> between HashMap.java and IdentityHashMap.java much more obvious.)
>
> In short, making them similar is great but lets make them both better
> rather than make IdentityHashMap worse.


I agree with this approach; clean HashMap, copy to IdentityHashMap, make
Identity-specific changes.


>
>
> > Keeping this in mind, you can see that the assumption of "minimal
> > change" can't be preserved
>
> Nonsense.  I have a 33% smaller cleaner patch created in ~10 minutes
> using emacs and ediff mode to prove it.  And armed with the extra
> information that was missing from the JIRA I can suggest an even better
> way see below.
>
> > and reading out the patch itself is pretty useless.
>
> I disagree; I read it and I don't think it was a useless exercise.  If
> my reading of the patch resulted in what you referred to above as "good
> comments" then clearly it wasn't useless.
>
> Of course, it is not the whole story but then I was never under the
> illusion that it was and I am moving on to look at the result of the
> patch next.  Personally, I think people underestimate the value of
> reading raw patches.  You do get to see just the changes - in a clean
> patch anyway - and "bad smells"[0] are often easier to see this way.
>

Agreed. This is something that I didn't quite get at first, but after
working with several different projects and being committer here, the power
of patches and diffs has certainly won me over. I have a much greater
respect for SCM tools now.

I love the "code smells" reference -- this is a turn-of-phrase that I
frequently use.

-Nathan


>
> > Rather, it makes sense to apply the patch to local copy and then
> > see whether the differences between IHM and HM are enough, whether
> > resulting IHM implementation contradicts with spec, whether IHM passes
> > the tests and runs the workloads well and stuff.
>
> I'm getting to that part.  You stated that the patch was "clean,
> non-breaking, consistent implementation, ready to be committed" and I
> was testing those assertions in that order.  Although I've probably made
> my mind up about the last one too.
>
> > Of course, to avoid these confusions I can rewrite entire Collections
> > at once and then contribute a mega-patch, but I don't think this is
> > the proper way of doing things.
>
> That's an interesting goal but I don't believe mega-patches are required
> to achieve it.  Before I became a committer, I did a huge amount of
> work unifying the native code and I did it in *lots* of small steps.
> Each step was clear.  I sometimes submitted a series of patches (and/or
> scripts) for a single JIRA to make the steps as transparent as possible.
> For example, for HARMONY-5771, you might do something like:
>
>  1) Attached a script (or documented) doing:
>
>    svn copy modules/luni/src/main/java/java/util/HashMap.java \
>    modules/luni/src/main/java/java/util/IdentityHashMap.java
>
>  2) Attach a patch to be applied after the copy with the changes.
>
> I think what you had done would be much clearer and I might not have
> started this discussion.  Certainly reading the patch would have been
> much easier and the *important* changes that really need reviewing would
> have been obvious not cluttered like the current patch.[1]
>
> Having said that, I think that would still be the wrong thing to do in
> this case and you'd need:
>
>  0) Attach a patch to fix some ugly things about HashMap
>
> to avoid duplicating the IMHO less wonderful aspects of HashMap in
> IdentityHashMap.
>
> > I prefer gradual updates instead.
>
> Well, at least, we can agree about that. ;-)
>
> -Mark.
>
> [0]
> http://sis36.berkeley.edu/projects/streek/agile/bad-smells-in-code.html
>
> [1] The svn diff on the commits list would still be a mess but hopefully
> the committer would use a log message that made it clear there was a
> more readable version in the JIRA.
>
> > On Tue, Apr 22, 2008 at 9:02 PM, Mark Hindess
> > <ma...@googlemail.com>>
> wrote:
> > >
> > >  On 22 April 2008 at 15:23, "Aleksey Shipilev" <
> aleksey.shipilev@gmail.com<ht...@gmail.com>
> >
> > >  wrote:
> > >
> > > > Tim,
> > >  >
> > >  > I support your proposal and try to make distinction between
> > >  > proof-of-concept and clean patches.
> > >  >
> > >  > Performance improvements could be notorious work, requiring a lot
> of
> > >  > resources. The bad thing is, they could give nothing after the work
> is
> > >  > done, that's why we are doing thought experiments and
> proof-of-concept
> > >  > patches before actually implementing something. Proof-of-concept
> may
> > >  > also give a hint on what maximum improvement we can get and set the
> > >  > expectations for clean patch.
> > >  >
> > >  > So we are not looking for hiding the problems, rather we're
> exploring
> > >  > what could it worth.
> > >  >
> > >  > F.ex. the experiments on IdentityHashMap performance finally led to
> > >  > clean, non-breaking, consistent implementation, ready to be
> committed
> > >  > patch [1]. Can we focus on reviewing it?
> > >
> > >  "clean [snip] patch"?  Okay, I'll take the bait...
> > >
> > >  1) It is good to fix the assert in the test case, but if you are
> going to
> > >  change it at all you might as well replace:
> > >
> > >   assertTrue("TestB. removed entries incorrectly", map.size() == 11 &&
> > >  !result);
> > >
> > >  with:
> > >
> > >   assertEquals("TestB. removed entries incorrectly, size is not
> correct",
> > >                11, map.size());
> > >   assertFalse("TestB. removed entries incorrectly, result is not
> correct",
> > >               result);
> > >
> > >
> > >  2) I think the IdentityHashMap javadoc comment change removes too
> much
> > >  important information.
> > >
> > >
> > >  3) I don't understand why you make arbitrary whitespace changes -
> such
> > >  as rewriting:
> > >
> > >  public class IdentityHashMap<K, V> extends AbstractMap<K, V>
> implements
> > >         Map<K, V>, Serializable, Cloneable {
> > >
> > >  to be longer than 80-characters.
> > >
> > >
> > >  4) You rename DEFAULT_MAX_SIZE to DEFAULT_SIZE but I don't really see
> > >  the relevance of this to the changes.
> > >
> > >
> > >  5) You seem to remove a number of correct comments from the
> > >  declarations of threshold, DEFAULT_MAX_SIZE, loadFactor, and
> modCount.
> > >
> > >  I almost I gave up reading the patch at this point out of frustration
> > >  at the number of irrelevant changes.  Such as:
> > >
> > >  a) the re-ordering of often unchanged public methods - previously the
> > >  public methods were grouped by similar functionality which seems
> logical
> > >  to me but now they are in alphabetical order.  This makes the patch
> much
> > >  bigger and harder to read.  If there is a good reason to re-order the
> > >  methods, do it in a separate patch.
> > >
> > >
> > >  b) The arbitrary changes to javadoc like:
> > >
> > >      /**
> > >  -     * Searches this Map for the specified value.
> > >  -     *
> > >  -     *
> > >  +     * Searches this IdentityHashMap for the specified value.
> > >  +     *
> > >       * @param value
> > >       *            the object to search for
> > >  -     * @return true if <code>value</code> is a value of this Map,
> false
> > >  +     * @return true if <code>value</code> is a value of this
> IdentityHash
> > Map,
> > >  false
> > >       *         otherwise
> > >       */
> > >
> > >  c) The arbitrary renaming of IdentityHashMapEntry to Entry.
> > >
> > >  Is it unrealistic to expect clean patches with justifications for
> > >  changes?  Can we optimise patches as well as code?
> > >
> > >  If I apply the patch and then revert the whitespace changes,
> > >  javadoc reformatting, renaming of DEFAULT_MAX_SIZE, renaming of
> > >  IdentityHashMapEntry, etc. the the resulting svn diff contains at
> least
> > >  33% fewer +/- lines than the one on the JIRA and is still
> functionally
> > >  the same.
> > >
> > >  As someone who tries to read svn commit messages (and all committers
> or
> > >  would-be-committers should be doing this to some extent!) patches
> like
> > >  this are a nightmare.
> > >
> > >  Regards,
> > >  -Mark.
> > >
> > >
> > >
> > >  >
> > >  > Thanks,
> > >  > Aleksey.
> > >  >
> > >  > [1] https://issues.apache.org/jira/browse/HARMONY-5771
> > >  >
> > >  > On Tue, Apr 22, 2008 at 12:09 PM, Tim Ellison <
> t.p.ellison@gmail.com<ht...@gmail.com>>
> wr
> > ote:
> > >  > > Aleksey Shipilev wrote:
> > >  > >
> > >  > > > Right, Tim. It was just the proof-of-concept "Does
> IdentityHashMap
> > >  > > > tuning worth it?".
> > >  > > >
> > >  > >
> > >  > >  I think it is unhelpful to mix actual, concrete suggestions for
> perfo
> > rmanc
> > >  > e
> > >  > > improvements, with proof-of-concept ideas that break the
> specification
> >  or
> > >  > > security.
> > >  > >
> > >  > >  Please can people mark clearly in their mail whether they are
> suggest
> > ing
> > >  > > the idea only as "a thought experiment", otherwise somebody may
> apply
> > the
> > >  > > change without properly understanding the consequences.
> > >  > >
> > >  > >  I realize the onus is on the committer to ensure it is safe but
> hidin
> > g suc
> > >  > h
> > >  > > problems is, as I said, unhelpful.
> > >  > >
> > >  > >  Thanks,
> > >  > >  Tim
> > >  > >
> > >  > >
> > >  > >
> > >  > >
> > >  > >
> > >  > > > On Mon, Apr 21, 2008 at 4:50 PM, Tim Ellison <
> t.p.ellison@gmail.com<ht...@gmail.com>
> >
> > >  > > wrote:
> > >  > > >
> > >  > > > > Sergey Salishev wrote:
> > >  > > > >
> > >  > > > >
> > >  > > > > > It also should be noticed that replacing the
> IdentityHashMap wit
> > h
> > >  > > HashMap
> > >  > > > > >
> > >  > > > > in
> > >  > > > >
> > >  > > > > > Thread impl gives 4.5x boost on ThreadLocalBench.
> > >  > > > > >
> > >  > > > > >
> > >  > > > >  But it would be wrong since it opens a security
> vulnerability.
> > >  > > > >
> > >  > > > >  See:
> > >  > > > >  http://markmail.org/message/rwq3kif5n33wp65m
> > >  > > > >  http://svn.apache.org/viewvc?view=rev&revision=477355
> > >  > > > >
> > >  > > > >
> > >  > > > >  Regards,
> > >  > > > >  Tim
> > >  > > > >
> > >  > > > >
> > >  > > >
> > >  > > >
> > >  > >
> > >  >
> > >
> > >
> > >
> >
>
>
>

Re: [classlib][luni][performance] IdentityHashMap implementation

Posted by Mark Hindess <ma...@googlemail.com>.
On 22 April 2008 at 23:02, "Aleksey Shipilev" <al...@gmail.com> 
wrote:
>
> Good comments, Mark, thanks! But I have to chill your fury a little :)

I'm might be a little bothered but I'm certainly not furious.  I just
wish to point out issues so that we can make things better.  I'd not
want to see the patch applied in its current form but I've already
moved on from the "clean" issue - by proving to myself that it could be
cleaned up - and I'm looking at the "non-breaking" part now.

> The reason behind all that changes is that entire IdentityHashMap
> implementation was thrown away and replaced by HashMap, then the
> relevant changes that transform it back to IdentityHashMap were done
> (changing hashCode() to System.identityHashCode(), equals to ==, etc).

sigh... this should really have been made clear in the JIRA - see below
for an example of how this could be done.

> So if you diff the HashMap.java and IdentityHashMap.java after the
> patch applied, you should notice very small difference. That's a
> good sign because thus we can easily identify and generalize the
> common code in both implementations, as we suggested to do. That's the
> generalization we talked about with Nathan and Tim, right?

Well, I've not looked at HashMap.java but I suspect that it might have
methods ordered alphabetically too which is horrible and should be
fixed.

Also, when I reverted the spurious javadoc changes back to the
original form, I replaced a number of instances of the specific word
IdentityHashMap with the less specific word Map.  Given the context
doesn't detract from the meaning and does make it easier to read.
Rather than make the IdentityHashMap javadoc more like the HashMap
javadoc I suggest we change the HashMap javadoc in this way too.  (This
is a personal preference but it would make the significant differences
between HashMap.java and IdentityHashMap.java much more obvious.)

In short, making them similar is great but lets make them both better
rather than make IdentityHashMap worse.

> Keeping this in mind, you can see that the assumption of "minimal
> change" can't be preserved

Nonsense.  I have a 33% smaller cleaner patch created in ~10 minutes
using emacs and ediff mode to prove it.  And armed with the extra
information that was missing from the JIRA I can suggest an even better
way see below.

> and reading out the patch itself is pretty useless.

I disagree; I read it and I don't think it was a useless exercise.  If
my reading of the patch resulted in what you referred to above as "good
comments" then clearly it wasn't useless.

Of course, it is not the whole story but then I was never under the
illusion that it was and I am moving on to look at the result of the
patch next.  Personally, I think people underestimate the value of
reading raw patches.  You do get to see just the changes - in a clean
patch anyway - and "bad smells"[0] are often easier to see this way.

> Rather, it makes sense to apply the patch to local copy and then
> see whether the differences between IHM and HM are enough, whether
> resulting IHM implementation contradicts with spec, whether IHM passes
> the tests and runs the workloads well and stuff.

I'm getting to that part.  You stated that the patch was "clean,
non-breaking, consistent implementation, ready to be committed" and I
was testing those assertions in that order.  Although I've probably made
my mind up about the last one too.

> Of course, to avoid these confusions I can rewrite entire Collections
> at once and then contribute a mega-patch, but I don't think this is
> the proper way of doing things.

That's an interesting goal but I don't believe mega-patches are required
to achieve it.  Before I became a committer, I did a huge amount of
work unifying the native code and I did it in *lots* of small steps.
Each step was clear.  I sometimes submitted a series of patches (and/or
scripts) for a single JIRA to make the steps as transparent as possible.
For example, for HARMONY-5771, you might do something like:

  1) Attached a script (or documented) doing:

    svn copy modules/luni/src/main/java/java/util/HashMap.java \
    modules/luni/src/main/java/java/util/IdentityHashMap.java

  2) Attach a patch to be applied after the copy with the changes.

I think what you had done would be much clearer and I might not have
started this discussion.  Certainly reading the patch would have been
much easier and the *important* changes that really need reviewing would
have been obvious not cluttered like the current patch.[1]

Having said that, I think that would still be the wrong thing to do in
this case and you'd need:

  0) Attach a patch to fix some ugly things about HashMap

to avoid duplicating the IMHO less wonderful aspects of HashMap in
IdentityHashMap.

> I prefer gradual updates instead.

Well, at least, we can agree about that. ;-)

-Mark.

[0] http://sis36.berkeley.edu/projects/streek/agile/bad-smells-in-code.html

[1] The svn diff on the commits list would still be a mess but hopefully
the committer would use a log message that made it clear there was a
more readable version in the JIRA.

> On Tue, Apr 22, 2008 at 9:02 PM, Mark Hindess
> <ma...@googlemail.com> wrote:
> >
> >  On 22 April 2008 at 15:23, "Aleksey Shipilev" <al...@gmail.com>
> >  wrote:
> >
> > > Tim,
> >  >
> >  > I support your proposal and try to make distinction between
> >  > proof-of-concept and clean patches.
> >  >
> >  > Performance improvements could be notorious work, requiring a lot of
> >  > resources. The bad thing is, they could give nothing after the work is
> >  > done, that's why we are doing thought experiments and proof-of-concept
> >  > patches before actually implementing something. Proof-of-concept may
> >  > also give a hint on what maximum improvement we can get and set the
> >  > expectations for clean patch.
> >  >
> >  > So we are not looking for hiding the problems, rather we're exploring
> >  > what could it worth.
> >  >
> >  > F.ex. the experiments on IdentityHashMap performance finally led to
> >  > clean, non-breaking, consistent implementation, ready to be committed
> >  > patch [1]. Can we focus on reviewing it?
> >
> >  "clean [snip] patch"?  Okay, I'll take the bait...
> >
> >  1) It is good to fix the assert in the test case, but if you are going to
> >  change it at all you might as well replace:
> >
> >   assertTrue("TestB. removed entries incorrectly", map.size() == 11 &&
> >  !result);
> >
> >  with:
> >
> >   assertEquals("TestB. removed entries incorrectly, size is not correct",
> >                11, map.size());
> >   assertFalse("TestB. removed entries incorrectly, result is not correct",
> >               result);
> >
> >
> >  2) I think the IdentityHashMap javadoc comment change removes too much
> >  important information.
> >
> >
> >  3) I don't understand why you make arbitrary whitespace changes - such
> >  as rewriting:
> >
> >  public class IdentityHashMap<K, V> extends AbstractMap<K, V> implements
> >         Map<K, V>, Serializable, Cloneable {
> >
> >  to be longer than 80-characters.
> >
> >
> >  4) You rename DEFAULT_MAX_SIZE to DEFAULT_SIZE but I don't really see
> >  the relevance of this to the changes.
> >
> >
> >  5) You seem to remove a number of correct comments from the
> >  declarations of threshold, DEFAULT_MAX_SIZE, loadFactor, and modCount.
> >
> >  I almost I gave up reading the patch at this point out of frustration
> >  at the number of irrelevant changes.  Such as:
> >
> >  a) the re-ordering of often unchanged public methods - previously the
> >  public methods were grouped by similar functionality which seems logical
> >  to me but now they are in alphabetical order.  This makes the patch much
> >  bigger and harder to read.  If there is a good reason to re-order the
> >  methods, do it in a separate patch.
> >
> >
> >  b) The arbitrary changes to javadoc like:
> >
> >      /**
> >  -     * Searches this Map for the specified value.
> >  -     *
> >  -     *
> >  +     * Searches this IdentityHashMap for the specified value.
> >  +     *
> >       * @param value
> >       *            the object to search for
> >  -     * @return true if <code>value</code> is a value of this Map, false
> >  +     * @return true if <code>value</code> is a value of this IdentityHash
> Map,
> >  false
> >       *         otherwise
> >       */
> >
> >  c) The arbitrary renaming of IdentityHashMapEntry to Entry.
> >
> >  Is it unrealistic to expect clean patches with justifications for
> >  changes?  Can we optimise patches as well as code?
> >
> >  If I apply the patch and then revert the whitespace changes,
> >  javadoc reformatting, renaming of DEFAULT_MAX_SIZE, renaming of
> >  IdentityHashMapEntry, etc. the the resulting svn diff contains at least
> >  33% fewer +/- lines than the one on the JIRA and is still functionally
> >  the same.
> >
> >  As someone who tries to read svn commit messages (and all committers or
> >  would-be-committers should be doing this to some extent!) patches like
> >  this are a nightmare.
> >
> >  Regards,
> >  -Mark.
> >
> >
> >
> >  >
> >  > Thanks,
> >  > Aleksey.
> >  >
> >  > [1] https://issues.apache.org/jira/browse/HARMONY-5771
> >  >
> >  > On Tue, Apr 22, 2008 at 12:09 PM, Tim Ellison <t....@gmail.com> wr
> ote:
> >  > > Aleksey Shipilev wrote:
> >  > >
> >  > > > Right, Tim. It was just the proof-of-concept "Does IdentityHashMap
> >  > > > tuning worth it?".
> >  > > >
> >  > >
> >  > >  I think it is unhelpful to mix actual, concrete suggestions for perfo
> rmanc
> >  > e
> >  > > improvements, with proof-of-concept ideas that break the specification
>  or
> >  > > security.
> >  > >
> >  > >  Please can people mark clearly in their mail whether they are suggest
> ing
> >  > > the idea only as "a thought experiment", otherwise somebody may apply 
> the
> >  > > change without properly understanding the consequences.
> >  > >
> >  > >  I realize the onus is on the committer to ensure it is safe but hidin
> g suc
> >  > h
> >  > > problems is, as I said, unhelpful.
> >  > >
> >  > >  Thanks,
> >  > >  Tim
> >  > >
> >  > >
> >  > >
> >  > >
> >  > >
> >  > > > On Mon, Apr 21, 2008 at 4:50 PM, Tim Ellison <t....@gmail.com>
> >  > > wrote:
> >  > > >
> >  > > > > Sergey Salishev wrote:
> >  > > > >
> >  > > > >
> >  > > > > > It also should be noticed that replacing the IdentityHashMap wit
> h
> >  > > HashMap
> >  > > > > >
> >  > > > > in
> >  > > > >
> >  > > > > > Thread impl gives 4.5x boost on ThreadLocalBench.
> >  > > > > >
> >  > > > > >
> >  > > > >  But it would be wrong since it opens a security vulnerability.
> >  > > > >
> >  > > > >  See:
> >  > > > >  http://markmail.org/message/rwq3kif5n33wp65m
> >  > > > >  http://svn.apache.org/viewvc?view=rev&revision=477355
> >  > > > >
> >  > > > >
> >  > > > >  Regards,
> >  > > > >  Tim
> >  > > > >
> >  > > > >
> >  > > >
> >  > > >
> >  > >
> >  >
> >
> >
> >
> 



Re: [classlib][luni][performance] IdentityHashMap implementation

Posted by Aleksey Shipilev <al...@gmail.com>.
Good comments, Mark, thanks! But I have to chill your fury a little :)

The reason behind all that changes is that entire IdentityHashMap
implementation was thrown away and replaced by HashMap, then the
relevant changes that transform it back to IdentityHashMap were done
(changing hashCode() to System.identityHashCode(), equals to ==, etc).
So if you diff the HashMap.java and IdentityHashMap.java after the
patch applied, you should notice very small difference. That's a good
sign because thus we can easily identify and generalize the common
code in both implementations, as we suggested to do. That's the
generalization we talked about with Nathan and Tim, right?

Keeping this in mind, you can see that the assumption of "minimal
change" can't be preserved and reading out the patch itself is pretty
useless. Rather, it makes sense to apply the patch to local copy and
then see whether the differences between IHM and HM are enough,
whether resulting IHM implementation contradicts with spec, whether
IHM passes the tests and runs the workloads well and stuff.

Of course, to avoid these confusions I can rewrite entire Collections
at once and then contribute a mega-patch, but I don't think this is
the proper way of doing things. I prefer gradual updates instead.

Thanks,
Aleksey.

On Tue, Apr 22, 2008 at 9:02 PM, Mark Hindess
<ma...@googlemail.com> wrote:
>
>  On 22 April 2008 at 15:23, "Aleksey Shipilev" <al...@gmail.com>
>  wrote:
>
> > Tim,
>  >
>  > I support your proposal and try to make distinction between
>  > proof-of-concept and clean patches.
>  >
>  > Performance improvements could be notorious work, requiring a lot of
>  > resources. The bad thing is, they could give nothing after the work is
>  > done, that's why we are doing thought experiments and proof-of-concept
>  > patches before actually implementing something. Proof-of-concept may
>  > also give a hint on what maximum improvement we can get and set the
>  > expectations for clean patch.
>  >
>  > So we are not looking for hiding the problems, rather we're exploring
>  > what could it worth.
>  >
>  > F.ex. the experiments on IdentityHashMap performance finally led to
>  > clean, non-breaking, consistent implementation, ready to be committed
>  > patch [1]. Can we focus on reviewing it?
>
>  "clean [snip] patch"?  Okay, I'll take the bait...
>
>  1) It is good to fix the assert in the test case, but if you are going to
>  change it at all you might as well replace:
>
>   assertTrue("TestB. removed entries incorrectly", map.size() == 11 &&
>  !result);
>
>  with:
>
>   assertEquals("TestB. removed entries incorrectly, size is not correct",
>                11, map.size());
>   assertFalse("TestB. removed entries incorrectly, result is not correct",
>               result);
>
>
>  2) I think the IdentityHashMap javadoc comment change removes too much
>  important information.
>
>
>  3) I don't understand why you make arbitrary whitespace changes - such
>  as rewriting:
>
>  public class IdentityHashMap<K, V> extends AbstractMap<K, V> implements
>         Map<K, V>, Serializable, Cloneable {
>
>  to be longer than 80-characters.
>
>
>  4) You rename DEFAULT_MAX_SIZE to DEFAULT_SIZE but I don't really see
>  the relevance of this to the changes.
>
>
>  5) You seem to remove a number of correct comments from the
>  declarations of threshold, DEFAULT_MAX_SIZE, loadFactor, and modCount.
>
>  I almost I gave up reading the patch at this point out of frustration
>  at the number of irrelevant changes.  Such as:
>
>  a) the re-ordering of often unchanged public methods - previously the
>  public methods were grouped by similar functionality which seems logical
>  to me but now they are in alphabetical order.  This makes the patch much
>  bigger and harder to read.  If there is a good reason to re-order the
>  methods, do it in a separate patch.
>
>
>  b) The arbitrary changes to javadoc like:
>
>      /**
>  -     * Searches this Map for the specified value.
>  -     *
>  -     *
>  +     * Searches this IdentityHashMap for the specified value.
>  +     *
>       * @param value
>       *            the object to search for
>  -     * @return true if <code>value</code> is a value of this Map, false
>  +     * @return true if <code>value</code> is a value of this IdentityHashMap,
>  false
>       *         otherwise
>       */
>
>  c) The arbitrary renaming of IdentityHashMapEntry to Entry.
>
>  Is it unrealistic to expect clean patches with justifications for
>  changes?  Can we optimise patches as well as code?
>
>  If I apply the patch and then revert the whitespace changes,
>  javadoc reformatting, renaming of DEFAULT_MAX_SIZE, renaming of
>  IdentityHashMapEntry, etc. the the resulting svn diff contains at least
>  33% fewer +/- lines than the one on the JIRA and is still functionally
>  the same.
>
>  As someone who tries to read svn commit messages (and all committers or
>  would-be-committers should be doing this to some extent!) patches like
>  this are a nightmare.
>
>  Regards,
>  -Mark.
>
>
>
>  >
>  > Thanks,
>  > Aleksey.
>  >
>  > [1] https://issues.apache.org/jira/browse/HARMONY-5771
>  >
>  > On Tue, Apr 22, 2008 at 12:09 PM, Tim Ellison <t....@gmail.com> wrote:
>  > > Aleksey Shipilev wrote:
>  > >
>  > > > Right, Tim. It was just the proof-of-concept "Does IdentityHashMap
>  > > > tuning worth it?".
>  > > >
>  > >
>  > >  I think it is unhelpful to mix actual, concrete suggestions for performanc
>  > e
>  > > improvements, with proof-of-concept ideas that break the specification or
>  > > security.
>  > >
>  > >  Please can people mark clearly in their mail whether they are suggesting
>  > > the idea only as "a thought experiment", otherwise somebody may apply the
>  > > change without properly understanding the consequences.
>  > >
>  > >  I realize the onus is on the committer to ensure it is safe but hiding suc
>  > h
>  > > problems is, as I said, unhelpful.
>  > >
>  > >  Thanks,
>  > >  Tim
>  > >
>  > >
>  > >
>  > >
>  > >
>  > > > On Mon, Apr 21, 2008 at 4:50 PM, Tim Ellison <t....@gmail.com>
>  > > wrote:
>  > > >
>  > > > > Sergey Salishev wrote:
>  > > > >
>  > > > >
>  > > > > > It also should be noticed that replacing the IdentityHashMap with
>  > > HashMap
>  > > > > >
>  > > > > in
>  > > > >
>  > > > > > Thread impl gives 4.5x boost on ThreadLocalBench.
>  > > > > >
>  > > > > >
>  > > > >  But it would be wrong since it opens a security vulnerability.
>  > > > >
>  > > > >  See:
>  > > > >  http://markmail.org/message/rwq3kif5n33wp65m
>  > > > >  http://svn.apache.org/viewvc?view=rev&revision=477355
>  > > > >
>  > > > >
>  > > > >  Regards,
>  > > > >  Tim
>  > > > >
>  > > > >
>  > > >
>  > > >
>  > >
>  >
>
>
>

Re: [classlib][luni][performance] IdentityHashMap implementation

Posted by Endre Stølsvik <En...@stolsvik.com>.
> a) the re-ordering of often unchanged public methods - previously the
> public methods were grouped by similar functionality which seems logical
> to me but now they are in alphabetical order.  This makes the patch much
> bigger and harder to read.  If there is a good reason to re-order the
> methods, do it in a separate patch.

If it makes ANY SENSE todo such things, then you are NOT using good 
enough tools. Have your IDE run the "Code outline" in alpha order if it 
pleases you. The file itself should obviously be in functional groups.

Ordering after functional groups is _the_ way, not _a_ way.

'javadoc' also understand this: when making the html, in the first 
block, listing the methods with the first sentence, it reorders the 
methods into alpha order, while the second block with the methods with 
the full javadoc is ordered as in the class. Perfect logic.

Alphabetical ordering of code - I'm getting aggressive thoughts by the 
mere idea. Such evilness must be fought.

Endre.