You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by Simon Willnauer <si...@googlemail.com> on 2011/06/30 14:55:51 UTC

Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

hmm are you concerned about the extra Math.min that happens in the
copyOf method?
I don't how that relates to "intrinsic" and java 1.7

I don't have strong feelings here just checking if you mix something
up in the comment you put there... I am happy to keep the old and now
current code

simon

On Thu, Jun 30, 2011 at 2:42 PM,  <rm...@apache.org> wrote:
> Author: rmuir
> Date: Thu Jun 30 12:42:17 2011
> New Revision: 1141510
>
> URL: http://svn.apache.org/viewvc?rev=1141510&view=rev
> Log:
> LUCENE-3239: remove use of slow Arrays.copyOf
>
> Modified:
>    lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>
> Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
> URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r1=1141509&r2=1141510&view=diff
> ==============================================================================
> --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original)
> +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
> @@ -2,7 +2,6 @@ package org.apache.lucene.util;
>
>  import java.io.IOException;
>  import java.io.OutputStream;
> -import java.util.Arrays;
>
>  /**
>  * Licensed to the Apache Software Foundation (ASF) under one or more
> @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
>   }
>
>   private void grow(int newLength) {
> -    buffer = Arrays.copyOf(buffer, newLength);
> +    // It actually should be: (Java 1.7, when its intrinsic on all machines)
> +    // buffer = Arrays.copyOf(buffer, newLength);
> +    byte[] newBuffer = new byte[newLength];
> +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
> +    buffer = newBuffer;
>   }
>
>   /**
>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Uwe Schindler <uw...@thetaphi.de>.
Robert, 

as noted in my other eMail, ist only slow for the generic Object[] method (as it uses j.l.reflect.Array.newInstance(Class componentType)). We are talking here about byte[], and the Arrays method is implemented with the same 3 lines of code, Simon replaced. The only difference is a Math.min() which is intrinsic (it is used, as Arrays.copyOf supports shrinking size, so the System.arrayCopy() needs upper limit to not AIOOBE).

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Robert Muir [mailto:rcmuir@gmail.com]
> Sent: Thursday, June 30, 2011 3:05 PM
> To: dev@lucene.apache.org; simon.willnauer@gmail.com
> Cc: commits@lucene.apache.org
> Subject: Re: svn commit: r1141510 -
> /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
> ByteArrayOutputStream.java
> 
> because on windows 32bit at least, -client is still the default on most jres out
> there.
> 
> i realize people don't care about -client, but i will police
> foo[].clone() / arrays.copyOf etc to prevent problems.
> 
> There are comments about this stuff on the relevant bug reports (oracle's
> site is down, sorry) linked to this issue.
> https://issues.apache.org/jira/browse/LUCENE-2674
> 
> Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we
> should always use arraycopy.
> 
> On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
> <si...@googlemail.com> wrote:
> > hmm are you concerned about the extra Math.min that happens in the
> > copyOf method?
> > I don't how that relates to "intrinsic" and java 1.7
> >
> > I don't have strong feelings here just checking if you mix something
> > up in the comment you put there... I am happy to keep the old and now
> > current code
> >
> > simon
> >
> > On Thu, Jun 30, 2011 at 2:42 PM,  <rm...@apache.org> wrote:
> >> Author: rmuir
> >> Date: Thu Jun 30 12:42:17 2011
> >> New Revision: 1141510
> >>
> >> URL: http://svn.apache.org/viewvc?rev=1141510&view=rev
> >> Log:
> >> LUCENE-3239: remove use of slow Arrays.copyOf
> >>
> >> Modified:
> >>
> >>
> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
> >> ByteArrayOutputStream.java
> >>
> >> Modified:
> >>
> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
> >> ByteArrayOutputStream.java
> >> URL:
> >>
> http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/
> >>
> org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r
> >> 1=1141509&r2=1141510&view=diff
> >>
> ==========================================================
> ===========
> >> =========
> >> ---
> >>
> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
> >> ByteArrayOutputStream.java (original)
> >> +++
> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Un
> >> +++ safeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
> >> @@ -2,7 +2,6 @@ package org.apache.lucene.util;
> >>
> >>  import java.io.IOException;
> >>  import java.io.OutputStream;
> >> -import java.util.Arrays;
> >>
> >>  /**
> >>  * Licensed to the Apache Software Foundation (ASF) under one or more
> >> @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
> >>   }
> >>
> >>   private void grow(int newLength) {
> >> -    buffer = Arrays.copyOf(buffer, newLength);
> >> +    // It actually should be: (Java 1.7, when its intrinsic on all
> >> + machines)
> >> +    // buffer = Arrays.copyOf(buffer, newLength);
> >> +    byte[] newBuffer = new byte[newLength];
> >> +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
> >> +    buffer = newBuffer;
> >>   }
> >>
> >>   /**
> >>
> >>
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > additional commands, e-mail: dev-help@lucene.apache.org
> >
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> commands, e-mail: dev-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Simon Willnauer <si...@googlemail.com>.
Robert I agree but doesn't that apply to Arrays.copyOf(Object[],int)
only? here we use a specialized primitive version?

simon

On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir <rc...@gmail.com> wrote:
> because on windows 32bit at least, -client is still the default on
> most jres out there.
>
> i realize people don't care about -client, but i will police
> foo[].clone() / arrays.copyOf etc to prevent problems.
>
> There are comments about this stuff on the relevant bug reports
> (oracle's site is down, sorry) linked to this issue.
> https://issues.apache.org/jira/browse/LUCENE-2674
>
> Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
> think we should always use arraycopy.
>
> On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
> <si...@googlemail.com> wrote:
>> hmm are you concerned about the extra Math.min that happens in the
>> copyOf method?
>> I don't how that relates to "intrinsic" and java 1.7
>>
>> I don't have strong feelings here just checking if you mix something
>> up in the comment you put there... I am happy to keep the old and now
>> current code
>>
>> simon
>>
>> On Thu, Jun 30, 2011 at 2:42 PM,  <rm...@apache.org> wrote:
>>> Author: rmuir
>>> Date: Thu Jun 30 12:42:17 2011
>>> New Revision: 1141510
>>>
>>> URL: http://svn.apache.org/viewvc?rev=1141510&view=rev
>>> Log:
>>> LUCENE-3239: remove use of slow Arrays.copyOf
>>>
>>> Modified:
>>>    lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>>>
>>> Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>>> URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r1=1141509&r2=1141510&view=diff
>>> ==============================================================================
>>> --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original)
>>> +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
>>> @@ -2,7 +2,6 @@ package org.apache.lucene.util;
>>>
>>>  import java.io.IOException;
>>>  import java.io.OutputStream;
>>> -import java.util.Arrays;
>>>
>>>  /**
>>>  * Licensed to the Apache Software Foundation (ASF) under one or more
>>> @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
>>>   }
>>>
>>>   private void grow(int newLength) {
>>> -    buffer = Arrays.copyOf(buffer, newLength);
>>> +    // It actually should be: (Java 1.7, when its intrinsic on all machines)
>>> +    // buffer = Arrays.copyOf(buffer, newLength);
>>> +    byte[] newBuffer = new byte[newLength];
>>> +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
>>> +    buffer = newBuffer;
>>>   }
>>>
>>>   /**
>>>
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Simon Willnauer <si...@googlemail.com>.
On Thu, Jun 30, 2011 at 3:26 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
> We had an issue about this with FST's array growing in Mike's code, in facts ist *much* slower for generic Arrays' T[] copyOf(T[]...), with T extends Object (uses slow reflection).
>
> For primitives it can only get faster in later JVMs, this is why we want to change all ArrayUtils.grow() to use this (and we don’t have a generic one there for above reason).

+1 - I don't see why this would be any slower... if we can get
improvements we should go for it. The issues and bugreports are all
for non-primitive copyOf methods so I don't see how this should affect
us

simon
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
>> -----Original Message-----
>> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf
>> Of Dawid Weiss
>> Sent: Thursday, June 30, 2011 3:11 PM
>> To: dev@lucene.apache.org
>> Subject: Re: svn commit: r1141510 -
>> /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
>> ByteArrayOutputStream.java
>>
>> Arrays.copyOf(primitive) is actually System.arraycopy by default. If intrinsics
>> are used it can only get faster. For object types it will probably be a bit slower
>> for -client because of a runtime check for the component type.
>>
>> Dawid
>>
>> On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir <rc...@gmail.com> wrote:
>> > because on windows 32bit at least, -client is still the default on
>> > most jres out there.
>> >
>> > i realize people don't care about -client, but i will police
>> > foo[].clone() / arrays.copyOf etc to prevent problems.
>> >
>> > There are comments about this stuff on the relevant bug reports
>> > (oracle's site is down, sorry) linked to this issue.
>> > https://issues.apache.org/jira/browse/LUCENE-2674
>> >
>> > Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
>> > think we should always use arraycopy.
>> >
>> > On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
>> > <si...@googlemail.com> wrote:
>> >> hmm are you concerned about the extra Math.min that happens in the
>> >> copyOf method?
>> >> I don't how that relates to "intrinsic" and java 1.7
>> >>
>> >> I don't have strong feelings here just checking if you mix something
>> >> up in the comment you put there... I am happy to keep the old and now
>> >> current code
>> >>
>> >> simon
>> >>
>> >> On Thu, Jun 30, 2011 at 2:42 PM,  <rm...@apache.org> wrote:
>> >>> Author: rmuir
>> >>> Date: Thu Jun 30 12:42:17 2011
>> >>> New Revision: 1141510
>> >>>
>> >>> URL: http://svn.apache.org/viewvc?rev=1141510&view=rev
>> >>> Log:
>> >>> LUCENE-3239: remove use of slow Arrays.copyOf
>> >>>
>> >>> Modified:
>> >>>
>> >>>
>> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
>> >>> eByteArrayOutputStream.java
>> >>>
>> >>> Modified:
>> >>>
>> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
>> >>> eByteArrayOutputStream.java
>> >>> URL:
>> >>>
>> http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java
>> >>>
>> /org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510
>> >>> &r1=1141509&r2=1141510&view=diff
>> >>>
>> ==========================================================
>> ==========
>> >>> ==========
>> >>> ---
>> >>>
>> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
>> >>> eByteArrayOutputStream.java (original)
>> >>> +++
>> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/U
>> >>> +++ nsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
>> >>> @@ -2,7 +2,6 @@ package org.apache.lucene.util;
>> >>>
>> >>>  import java.io.IOException;
>> >>>  import java.io.OutputStream;
>> >>> -import java.util.Arrays;
>> >>>
>> >>>  /**
>> >>>  * Licensed to the Apache Software Foundation (ASF) under one or
>> >>> more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
>> >>>   }
>> >>>
>> >>>   private void grow(int newLength) {
>> >>> -    buffer = Arrays.copyOf(buffer, newLength);
>> >>> +    // It actually should be: (Java 1.7, when its intrinsic on all
>> >>> + machines)
>> >>> +    // buffer = Arrays.copyOf(buffer, newLength);
>> >>> +    byte[] newBuffer = new byte[newLength];
>> >>> +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
>> >>> +    buffer = newBuffer;
>> >>>   }
>> >>>
>> >>>   /**
>> >>>
>> >>>
>> >>>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
>> >> additional commands, e-mail: dev-help@lucene.apache.org
>> >>
>> >>
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
>> > additional commands, e-mail: dev-help@lucene.apache.org
>> >
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
>> commands, e-mail: dev-help@lucene.apache.org
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Uwe Schindler <uw...@thetaphi.de>.
We had an issue about this with FST's array growing in Mike's code, in facts ist *much* slower for generic Arrays' T[] copyOf(T[]...), with T extends Object (uses slow reflection).

For primitives it can only get faster in later JVMs, this is why we want to change all ArrayUtils.grow() to use this (and we don’t have a generic one there for above reason).

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf
> Of Dawid Weiss
> Sent: Thursday, June 30, 2011 3:11 PM
> To: dev@lucene.apache.org
> Subject: Re: svn commit: r1141510 -
> /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
> ByteArrayOutputStream.java
> 
> Arrays.copyOf(primitive) is actually System.arraycopy by default. If intrinsics
> are used it can only get faster. For object types it will probably be a bit slower
> for -client because of a runtime check for the component type.
> 
> Dawid
> 
> On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir <rc...@gmail.com> wrote:
> > because on windows 32bit at least, -client is still the default on
> > most jres out there.
> >
> > i realize people don't care about -client, but i will police
> > foo[].clone() / arrays.copyOf etc to prevent problems.
> >
> > There are comments about this stuff on the relevant bug reports
> > (oracle's site is down, sorry) linked to this issue.
> > https://issues.apache.org/jira/browse/LUCENE-2674
> >
> > Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
> > think we should always use arraycopy.
> >
> > On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
> > <si...@googlemail.com> wrote:
> >> hmm are you concerned about the extra Math.min that happens in the
> >> copyOf method?
> >> I don't how that relates to "intrinsic" and java 1.7
> >>
> >> I don't have strong feelings here just checking if you mix something
> >> up in the comment you put there... I am happy to keep the old and now
> >> current code
> >>
> >> simon
> >>
> >> On Thu, Jun 30, 2011 at 2:42 PM,  <rm...@apache.org> wrote:
> >>> Author: rmuir
> >>> Date: Thu Jun 30 12:42:17 2011
> >>> New Revision: 1141510
> >>>
> >>> URL: http://svn.apache.org/viewvc?rev=1141510&view=rev
> >>> Log:
> >>> LUCENE-3239: remove use of slow Arrays.copyOf
> >>>
> >>> Modified:
> >>>
> >>>
> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
> >>> eByteArrayOutputStream.java
> >>>
> >>> Modified:
> >>>
> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
> >>> eByteArrayOutputStream.java
> >>> URL:
> >>>
> http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java
> >>>
> /org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510
> >>> &r1=1141509&r2=1141510&view=diff
> >>>
> ==========================================================
> ==========
> >>> ==========
> >>> ---
> >>>
> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf
> >>> eByteArrayOutputStream.java (original)
> >>> +++
> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/U
> >>> +++ nsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
> >>> @@ -2,7 +2,6 @@ package org.apache.lucene.util;
> >>>
> >>>  import java.io.IOException;
> >>>  import java.io.OutputStream;
> >>> -import java.util.Arrays;
> >>>
> >>>  /**
> >>>  * Licensed to the Apache Software Foundation (ASF) under one or
> >>> more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
> >>>   }
> >>>
> >>>   private void grow(int newLength) {
> >>> -    buffer = Arrays.copyOf(buffer, newLength);
> >>> +    // It actually should be: (Java 1.7, when its intrinsic on all
> >>> + machines)
> >>> +    // buffer = Arrays.copyOf(buffer, newLength);
> >>> +    byte[] newBuffer = new byte[newLength];
> >>> +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
> >>> +    buffer = newBuffer;
> >>>   }
> >>>
> >>>   /**
> >>>
> >>>
> >>>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> >> additional commands, e-mail: dev-help@lucene.apache.org
> >>
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > additional commands, e-mail: dev-help@lucene.apache.org
> >
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> commands, e-mail: dev-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Michael McCandless <lu...@mikemccandless.com>.
Fair enough, and I agree.

Though the least we could do is rotate in a Windows env, where Java
runs with -client, to our Jenkins.

But simple-to-follow rules like "Don't use Arrays.copyOf; use
System.arraycopy instead" (if indeed System.arraycopy seems to
generally not be slower) seem like a no-brainer.

Why risk Arrays.copyOf, anytime?  Shouldn't we never use it...?

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jun 30, 2011 at 3:09 PM, Dawid Weiss
<da...@cs.put.poznan.pl> wrote:
>> I think it's important Lucene keeps good performance on "ordinary"
>> machines/envs.
>
> Not that this voice will help in anything, but I think the above is
> virtually impossible to achieve unless you have a bunch of machines,
> OSs and VMs to continually test on and a consistent set of benchmarks
> plotted over time... and of course check every single commit for
> regression over all these combinations. And even then you'd always
> find a case of something being faster or slower on some combination of
> hardware/ software; optimizing for these differences makes little
> sense to me (people struggling with performance on some weird
> software/hardware combination can always change the VM vendor or a VM
> switch).
>
> Sorry for being so pessimistically unconstructive... :(
>
> Dawid
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Dawid Weiss <da...@cs.put.poznan.pl>.
> I think it's important Lucene keeps good performance on "ordinary"
> machines/envs.

Not that this voice will help in anything, but I think the above is
virtually impossible to achieve unless you have a bunch of machines,
OSs and VMs to continually test on and a consistent set of benchmarks
plotted over time... and of course check every single commit for
regression over all these combinations. And even then you'd always
find a case of something being faster or slower on some combination of
hardware/ software; optimizing for these differences makes little
sense to me (people struggling with performance on some weird
software/hardware combination can always change the VM vendor or a VM
switch).

Sorry for being so pessimistically unconstructive... :(

Dawid

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Michael McCandless <lu...@mikemccandless.com>.
I think it's important Lucene keeps good performance on "ordinary"
machines/envs.

It's really quite dangerous that the active Lucene devs all use beasts
for development/testing.  We draw false conclusions.

So we really should be testing with -client and if indeed generified
Arrays.copyOf (and anything else) is risky in such envs we should not
use it when System.arraycopy works more consistently.

Mike McCandless

http://blog.mikemccandless.com

On Thu, Jun 30, 2011 at 2:50 PM, Dawid Weiss
<da...@cs.put.poznan.pl> wrote:
>> I don't seen any evidence that this is any slower though.
>
> You need to run with -client (if the machine is a beast this is tricky
> because x64 will pick -server regardless of the command-line setting)
> and you need to be copying generic arrays. I think this can be shown
> -- a caliper benchmark would be perfect to demonstrate this in
> isolation; I may write one if I find a spare moment.
>
> Dawid
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Fri, Jul 1, 2011 at 1:47 AM, Simon Willnauer
<si...@googlemail.com> wrote:

> Mike I think we should do that but the real issue here is what if
> somebody comes up with any arbitrary method in the future claiming its
> slow we back out and use the "we think safe way" what if it is
> actually the other way around and copyOf is optimized by new VMs and
> the copyarray is slightly slower.

I think we take it case by case?  We do need to be careful when using
low level ops in Java.

In this specific case, we have been almost burned already by
Arrays.copyOf, and never by System.arraycopy (that I'm aware of), so
we should not cutover until we have some confidence that burning will
not occur.  Other cases will be different, but this one has enough
history that the bias seems clear.

> I just want
> to prevent the "we should not do this" it might be a problem in the
> big picture while the microbenchmark doesn't show a difference. At
> some point we have to rely on the JVM.

I agree, but generally the burden of proof is on the "new one".  Just
because we can use Java 1.6 code now doesn't mean we should blindly
cutover to new stuff.

System.arraycopy is tried & true and we've never hit a perf issue with
it, in my memory.

> Even if we benchmark on all OS with various JVMs we can't prevent jvm
> bug based perf hits.

Right, nothing is perfect here; this is just about mitigating risk.
We were lucky to have tracked down this slowdown down last time, I
think only because Robert was using a -client JVM at the time?

> While there was no bug reported for primitives
> here we don't have to be afraid of it either. I don't think its saver
> to use arraycopy at all its just a level of indirection safer but
> makes development more of a pain IMO.

Yes it's a slight hassle (2 extra lines), but if this mitigates risk,
it's worth it.

That said, if we can convince ourselves that in the primitives only
case, Arrays.copyOf has very little risk, then I'm OK w/ using it for
those cases.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Shai Erera <se...@gmail.com>.
About the encoders package - there are several encoders there besides
VInt, so I wouldn't dispose of it so quickly. That said, I think we
should definitely explore consolidating VInt with the core classes,
and maybe write an encoder which delegate to them.

Or, come up w/ a different approach for allowing to plug in different
Encoders. I don't rule out anything, as long as we preserve
functionality and capabilities.

Shai

On Friday, July 1, 2011, Michael McCandless <lu...@mikemccandless.com> wrote:
> On Fri, Jul 1, 2011 at 2:33 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
>> Hi,
>>
>> I don't understand the whole discussion here, so please compare these two implementations and tell me which one is faster. Please don't hurt me, if you don't want to see src.jar code from OpenJDK Java6 - just delete this mail if you don’t want to (the code here is licensed under GPL):
>
> This is the source code for a specific version of one specific Java
> impl.  If we knew all Java impls simply implemented the primitive case
> using System.arraycopy (admittedly it's hard to imagine that they
> wouldn't!) then we are fine.
>
>> This is our implementation, simon replaced and Robert reverted (UnsafeByteArrayOutputStream):
>>
>>  private void grow(int newLength) {
>>    // It actually should be: (Java 1.7, when its intrinsic on all machines)
>>    // buffer = Arrays.copyOf(buffer, newLength);
>>    byte[] newBuffer = new byte[newLength];
>>    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
>>    buffer = newBuffer;
>>  }
>>
>> So please look at the code, where is a difference that could slow down, except the Math.min() call which is an intrinsic in almost every JDK on earth?
>
> Right, in this case (if you used OpenJDK 6) we are obviously OK.  Not
> sure about other cases...
>
>> The problem we are talking about here is only about the generic Object[] copyOf method and also affects e.g. *all* Collection.toArray() methods - they all use this code, so whenever you use ArrayList.toArray() or similar, the slow code is executed. This is why we replaced Collections.sort() by CollectionUtil.sort, that does no array copy. Simon & me were not willing to replace the reallocations in FST code (Mike you remember, we reverted that on your GIT repo when we did perf tests) and other parts in Lucene (there are only few of them). The idea was only to replace primitive type code to make it easier readable. And with later JDK code it could even get faster (not slower), if Oracle starts to add intrinsics for those new methods (and that’s Dawid and mine reason to change to copyTo for primitive types). In general, if you use Java SDK methods, that are as fast as ours, they always have a chance to get faster in later JDKs. So we should always prefer Java SDK methods, unless they are slower because their default impl is too generic or has too much safety checks or uses reflection.
>
> OK I'm convinced (I think!) that for primitive types only, let's use
> Arrays.copyOf!
>
>> To come back to UnsafeByteArrayOutputStream:
>>
>> I would change the whole code, as I don’t like the allocation strategy in it (it's exponential, on every grow it doubles its size). We should change that to use ArrayUtils.grow() and ArrayUtils.oversize(), to have a similar allocation strategy like in trunk. Then we can discuss about this problem again when Simon & me wants to change ArrayUtils.grow methods to use Arrays.copyOf... *g* [just joking, I will never ask again, because this discussion here is endless and does not bring us forward].
>
> Well, it sounds like for primitive types, we can cutover
> ArrayUtils.grow methods.  Then we can look @ the nightly bench the
> next day ;)
>
> But I agree we should fix UnsafeByteArrayOutputStream... or, isn't it
> (almost) a dup of ByteArrayDataOutput?
>
>> The other thing I don’t like in the new faceting module is duplication of vint code. Why don’t we change it to use DataInput/DataOutput and use Dawid's new In/OutStream wrapper for DataOutput everywhere. This would be much more streamlined with all the code we currently have. Then we can encode the payloads (or later docvalues) using the new UnsafeByteArrayOutputStream, wrapped with a OutputStreamDataOutput wrapper? Or maybe add a ByteArrayDataOutput class.
>
> That sounds good!
>
> Uwe can you commit TODOs to the code w/ these ideas?
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Fri, Jul 1, 2011 at 2:33 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> Hi,
>
> I don't understand the whole discussion here, so please compare these two implementations and tell me which one is faster. Please don't hurt me, if you don't want to see src.jar code from OpenJDK Java6 - just delete this mail if you don’t want to (the code here is licensed under GPL):

This is the source code for a specific version of one specific Java
impl.  If we knew all Java impls simply implemented the primitive case
using System.arraycopy (admittedly it's hard to imagine that they
wouldn't!) then we are fine.

> This is our implementation, simon replaced and Robert reverted (UnsafeByteArrayOutputStream):
>
>  private void grow(int newLength) {
>    // It actually should be: (Java 1.7, when its intrinsic on all machines)
>    // buffer = Arrays.copyOf(buffer, newLength);
>    byte[] newBuffer = new byte[newLength];
>    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
>    buffer = newBuffer;
>  }
>
> So please look at the code, where is a difference that could slow down, except the Math.min() call which is an intrinsic in almost every JDK on earth?

Right, in this case (if you used OpenJDK 6) we are obviously OK.  Not
sure about other cases...

> The problem we are talking about here is only about the generic Object[] copyOf method and also affects e.g. *all* Collection.toArray() methods - they all use this code, so whenever you use ArrayList.toArray() or similar, the slow code is executed. This is why we replaced Collections.sort() by CollectionUtil.sort, that does no array copy. Simon & me were not willing to replace the reallocations in FST code (Mike you remember, we reverted that on your GIT repo when we did perf tests) and other parts in Lucene (there are only few of them). The idea was only to replace primitive type code to make it easier readable. And with later JDK code it could even get faster (not slower), if Oracle starts to add intrinsics for those new methods (and that’s Dawid and mine reason to change to copyTo for primitive types). In general, if you use Java SDK methods, that are as fast as ours, they always have a chance to get faster in later JDKs. So we should always prefer Java SDK methods, unless they are slower because their default impl is too generic or has too much safety checks or uses reflection.

OK I'm convinced (I think!) that for primitive types only, let's use
Arrays.copyOf!

> To come back to UnsafeByteArrayOutputStream:
>
> I would change the whole code, as I don’t like the allocation strategy in it (it's exponential, on every grow it doubles its size). We should change that to use ArrayUtils.grow() and ArrayUtils.oversize(), to have a similar allocation strategy like in trunk. Then we can discuss about this problem again when Simon & me wants to change ArrayUtils.grow methods to use Arrays.copyOf... *g* [just joking, I will never ask again, because this discussion here is endless and does not bring us forward].

Well, it sounds like for primitive types, we can cutover
ArrayUtils.grow methods.  Then we can look @ the nightly bench the
next day ;)

But I agree we should fix UnsafeByteArrayOutputStream... or, isn't it
(almost) a dup of ByteArrayDataOutput?

> The other thing I don’t like in the new faceting module is duplication of vint code. Why don’t we change it to use DataInput/DataOutput and use Dawid's new In/OutStream wrapper for DataOutput everywhere. This would be much more streamlined with all the code we currently have. Then we can encode the payloads (or later docvalues) using the new UnsafeByteArrayOutputStream, wrapped with a OutputStreamDataOutput wrapper? Or maybe add a ByteArrayDataOutput class.

That sounds good!

Uwe can you commit TODOs to the code w/ these ideas?

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

I don't understand the whole discussion here, so please compare these two implementations and tell me which one is faster. Please don't hurt me, if you don't want to see src.jar code from OpenJDK Java6 - just delete this mail if you don’t want to (the code here is licensed under GPL):

Arrays.java:
    /**
     * Copies the specified array, truncating or padding with zeros (if necessary)
     * so the copy has the specified length.  For all indices that are
     * valid in both the original array and the copy, the two arrays will
     * contain identical values.  For any indices that are valid in the
     * copy but not the original, the copy will contain <tt>(byte)0</tt>.
     * Such indices will exist if and only if the specified length
     * is greater than that of the original array.
     *
     * @param original the array to be copied
     * @param newLength the length of the copy to be returned
     * @return a copy of the original array, truncated or padded with zeros
     *     to obtain the specified length
     * @throws NegativeArraySizeException if <tt>newLength</tt> is negative
     * @throws NullPointerException if <tt>original</tt> is null
     * @since 1.6
     */
    public static byte[] copyOf(byte[] original, int newLength) {
        byte[] copy = new byte[newLength];
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }


This is our implementation, simon replaced and Robert reverted (UnsafeByteArrayOutputStream):

  private void grow(int newLength) {
    // It actually should be: (Java 1.7, when its intrinsic on all machines)
    // buffer = Arrays.copyOf(buffer, newLength);
    byte[] newBuffer = new byte[newLength];
    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
    buffer = newBuffer;
  }

So please look at the code, where is a difference that could slow down, except the Math.min() call which is an intrinsic in almost every JDK on earth?

The problem we are talking about here is only about the generic Object[] copyOf method and also affects e.g. *all* Collection.toArray() methods - they all use this code, so whenever you use ArrayList.toArray() or similar, the slow code is executed. This is why we replaced Collections.sort() by CollectionUtil.sort, that does no array copy. Simon & me were not willing to replace the reallocations in FST code (Mike you remember, we reverted that on your GIT repo when we did perf tests) and other parts in Lucene (there are only few of them). The idea was only to replace primitive type code to make it easier readable. And with later JDK code it could even get faster (not slower), if Oracle starts to add intrinsics for those new methods (and that’s Dawid and mine reason to change to copyTo for primitive types). In general, if you use Java SDK methods, that are as fast as ours, they always have a chance to get faster in later JDKs. So we should always prefer Java SDK methods, unless they are slower because their default impl is too generic or has too much safety checks or uses reflection.


To come back to UnsafeByteArrayOutputStream:

I would change the whole code, as I don’t like the allocation strategy in it (it's exponential, on every grow it doubles its size). We should change that to use ArrayUtils.grow() and ArrayUtils.oversize(), to have a similar allocation strategy like in trunk. Then we can discuss about this problem again when Simon & me wants to change ArrayUtils.grow methods to use Arrays.copyOf... *g* [just joking, I will never ask again, because this discussion here is endless and does not bring us forward].

The other thing I don’t like in the new faceting module is duplication of vint code. Why don’t we change it to use DataInput/DataOutput and use Dawid's new In/OutStream wrapper for DataOutput everywhere. This would be much more streamlined with all the code we currently have. Then we can encode the payloads (or later docvalues) using the new UnsafeByteArrayOutputStream, wrapped with a OutputStreamDataOutput wrapper? Or maybe add a ByteArrayDataOutput class.

Uwe
(getting crazy)

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Simon Willnauer [mailto:simon.willnauer@googlemail.com]
> Sent: Friday, July 01, 2011 7:47 AM
> To: Michael McCandless
> Cc: dev@lucene.apache.org; Dawid Weiss
> Subject: Re: svn commit: r1141510 -
> /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe
> ByteArrayOutputStream.java
> 
> On Fri, Jul 1, 2011 at 12:19 AM, Michael McCandless
> <lu...@mikemccandless.com> wrote:
> > On Thu, Jun 30, 2011 at 4:45 PM, Simon Willnauer
> > <si...@googlemail.com> wrote:
> >> On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss
> >> <da...@cs.put.poznan.pl> wrote:
> >>>> I don't seen any evidence that this is any slower though.
> >>>
> >>> You need to run with -client (if the machine is a beast this is
> >>> tricky because x64 will pick -server regardless of the command-line
> >>> setting) and you need to be copying generic arrays. I think this can
> >>> be shown
> >>> -- a caliper benchmark would be perfect to demonstrate this in
> >>> isolation; I may write one if I find a spare moment.
> >>
> >> this is what I want to see. I don't want to discuss based on some bug
> >> reported for a non-primitive version of copyOf thats all.
> >> its pointless to discuss if there is no evidence which I don't see. I
> >> am happy with arraycopy I would just have appreciated a discussion
> >> before backing the change out.
> >
> > I think the burden of proof here is on Arrays.copyOf.
> >
> > Ie, until we can prove (through benchmarking in different envs) that
> > it can be trusted, we should just stick with System.arraycopy to
> > reduce the risk.
> 
> Mike I think we should do that but the real issue here is what if somebody
> comes up with any arbitrary method in the future claiming its slow we back
> out and use the "we think safe way" what if it is actually the other way
> around and copyOf is optimized by new VMs and the copyarray is slightly
> slower. If robert would not have reverted this commit we would have not
> discussed that her at all. I just want to prevent the "we should not do this" it
> might be a problem in the big picture while the microbenchmark doesn't
> show a difference. At some point we have to rely on the JVM.
> Even if we benchmark on all OS with various JVMs we can't prevent jvm bug
> based perf hits. While there was no bug reported for primitives here we
> don't have to be afraid of it either. I don't think its saver to use arraycopy at
> all its just a level of indirection safer but makes development more of a pain
> IMO.
> 
> Just my $0.05
> >
> > Mike
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> commands, e-mail: dev-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Simon Willnauer <si...@googlemail.com>.
On Fri, Jul 1, 2011 at 12:19 AM, Michael McCandless
<lu...@mikemccandless.com> wrote:
> On Thu, Jun 30, 2011 at 4:45 PM, Simon Willnauer
> <si...@googlemail.com> wrote:
>> On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss
>> <da...@cs.put.poznan.pl> wrote:
>>>> I don't seen any evidence that this is any slower though.
>>>
>>> You need to run with -client (if the machine is a beast this is tricky
>>> because x64 will pick -server regardless of the command-line setting)
>>> and you need to be copying generic arrays. I think this can be shown
>>> -- a caliper benchmark would be perfect to demonstrate this in
>>> isolation; I may write one if I find a spare moment.
>>
>> this is what I want to see. I don't want to discuss based on some bug
>> reported for a non-primitive version of copyOf thats all.
>> its pointless to discuss if there is no evidence which I don't see. I
>> am happy with arraycopy I would just have appreciated a discussion
>> before backing the change out.
>
> I think the burden of proof here is on Arrays.copyOf.
>
> Ie, until we can prove (through benchmarking in different envs) that
> it can be trusted, we should just stick with System.arraycopy to
> reduce the risk.

Mike I think we should do that but the real issue here is what if
somebody comes up with any arbitrary method in the future claiming its
slow we back out and use the "we think safe way" what if it is
actually the other way around and copyOf is optimized by new VMs and
the copyarray is slightly slower. If robert would not have reverted
this commit we would have not discussed that her at all. I just want
to prevent the "we should not do this" it might be a problem in the
big picture while the microbenchmark doesn't show a difference. At
some point we have to rely on the JVM.
Even if we benchmark on all OS with various JVMs we can't prevent jvm
bug based perf hits. While there was no bug reported for primitives
here we don't have to be afraid of it either. I don't think its saver
to use arraycopy at all its just a level of indirection safer but
makes development more of a pain IMO.

Just my $0.05
>
> Mike
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Thu, Jun 30, 2011 at 4:45 PM, Simon Willnauer
<si...@googlemail.com> wrote:
> On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss
> <da...@cs.put.poznan.pl> wrote:
>>> I don't seen any evidence that this is any slower though.
>>
>> You need to run with -client (if the machine is a beast this is tricky
>> because x64 will pick -server regardless of the command-line setting)
>> and you need to be copying generic arrays. I think this can be shown
>> -- a caliper benchmark would be perfect to demonstrate this in
>> isolation; I may write one if I find a spare moment.
>
> this is what I want to see. I don't want to discuss based on some bug
> reported for a non-primitive version of copyOf thats all.
> its pointless to discuss if there is no evidence which I don't see. I
> am happy with arraycopy I would just have appreciated a discussion
> before backing the change out.

I think the burden of proof here is on Arrays.copyOf.

Ie, until we can prove (through benchmarking in different envs) that
it can be trusted, we should just stick with System.arraycopy to
reduce the risk.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Simon Willnauer <si...@googlemail.com>.
On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss
<da...@cs.put.poznan.pl> wrote:
>> I don't seen any evidence that this is any slower though.
>
> You need to run with -client (if the machine is a beast this is tricky
> because x64 will pick -server regardless of the command-line setting)
> and you need to be copying generic arrays. I think this can be shown
> -- a caliper benchmark would be perfect to demonstrate this in
> isolation; I may write one if I find a spare moment.

this is what I want to see. I don't want to discuss based on some bug
reported for a non-primitive version of copyOf thats all.
its pointless to discuss if there is no evidence which I don't see. I
am happy with arraycopy I would just have appreciated a discussion
before backing the change out.

simon
>
> Dawid
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Dawid Weiss <da...@cs.put.poznan.pl>.
> I don't seen any evidence that this is any slower though.

You need to run with -client (if the machine is a beast this is tricky
because x64 will pick -server regardless of the command-line setting)
and you need to be copying generic arrays. I think this can be shown
-- a caliper benchmark would be perfect to demonstrate this in
isolation; I may write one if I find a spare moment.

Dawid

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Simon Willnauer <si...@googlemail.com>.
On Thu, Jun 30, 2011 at 4:44 PM, Robert Muir <rc...@gmail.com> wrote:
> I think Dawid is correct here... so we should change this back? still
> personally when I see array clone() or copyOf() it makes me concerned, I
> know these are as fast as arraycopy in recent versions, but depending on
> which variant is used, and whether you use -server, can be slower... in
> general I just don't want us to have performance regressions on say windows
> 32bit over this stuff, personally I think arraycopy is a sure fire bet every
> time, but Ill concede the point that copyOf might not be slower for the
> primitive versions... I think in jdk7 we will not have this issue as -client
> sorta goes away in favor of the tiered thing? anyway, I think we should
> proceed with caution here as far as moving things over to copyOf, I don't
> see any evidence that its ever faster, but its definitely sometimes slower.

I don't seen any evidence that this is any slower though.

simon
>
> On Jun 30, 2011 9:12 AM, "Dawid Weiss" <da...@cs.put.poznan.pl> wrote:
>> Arrays.copyOf(primitive) is actually System.arraycopy by default. If
>> intrinsics are used it can only get faster. For object types it will
>> probably be a bit slower for -client because of a runtime check for
>> the component type.
>>
>> Dawid
>>
>> On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir <rc...@gmail.com> wrote:
>>> because on windows 32bit at least, -client is still the default on
>>> most jres out there.
>>>
>>> i realize people don't care about -client, but i will police
>>> foo[].clone() / arrays.copyOf etc to prevent problems.
>>>
>>> There are comments about this stuff on the relevant bug reports
>>> (oracle's site is down, sorry) linked to this issue.
>>> https://issues.apache.org/jira/browse/LUCENE-2674
>>>
>>> Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
>>> think we should always use arraycopy.
>>>
>>> On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
>>> <si...@googlemail.com> wrote:
>>>> hmm are you concerned about the extra Math.min that happens in the
>>>> copyOf method?
>>>> I don't how that relates to "intrinsic" and java 1.7
>>>>
>>>> I don't have strong feelings here just checking if you mix something
>>>> up in the comment you put there... I am happy to keep the old and now
>>>> current code
>>>>
>>>> simon
>>>>
>>>> On Thu, Jun 30, 2011 at 2:42 PM,  <rm...@apache.org> wrote:
>>>>> Author: rmuir
>>>>> Date: Thu Jun 30 12:42:17 2011
>>>>> New Revision: 1141510
>>>>>
>>>>> URL: http://svn.apache.org/viewvc?rev=1141510&view=rev
>>>>> Log:
>>>>> LUCENE-3239: remove use of slow Arrays.copyOf
>>>>>
>>>>> Modified:
>>>>>
>>>>>  lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>>>>>
>>>>> Modified:
>>>>> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>>>>> URL:
>>>>> http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r1=1141509&r2=1141510&view=diff
>>>>>
>>>>> ==============================================================================
>>>>> ---
>>>>> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>>>>> (original)
>>>>> +++
>>>>> lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>>>>> Thu Jun 30 12:42:17 2011
>>>>> @@ -2,7 +2,6 @@ package org.apache.lucene.util;
>>>>>
>>>>>  import java.io.IOException;
>>>>>  import java.io.OutputStream;
>>>>> -import java.util.Arrays;
>>>>>
>>>>>  /**
>>>>>  * Licensed to the Apache Software Foundation (ASF) under one or more
>>>>> @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
>>>>>   }
>>>>>
>>>>>   private void grow(int newLength) {
>>>>> -    buffer = Arrays.copyOf(buffer, newLength);
>>>>> +    // It actually should be: (Java 1.7, when its intrinsic on all
>>>>> machines)
>>>>> +    // buffer = Arrays.copyOf(buffer, newLength);
>>>>> +    byte[] newBuffer = new byte[newLength];
>>>>> +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
>>>>> +    buffer = newBuffer;
>>>>>   }
>>>>>
>>>>>   /**
>>>>>
>>>>>
>>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Robert Muir <rc...@gmail.com>.
I think Dawid is correct here... so we should change this back? still
personally when I see array clone() or copyOf() it makes me concerned, I
know these are as fast as arraycopy in recent versions, but depending on
which variant is used, and whether you use -server, can be slower... in
general I just don't want us to have performance regressions on say windows
32bit over this stuff, personally I think arraycopy is a sure fire bet every
time, but Ill concede the point that copyOf might not be slower for the
primitive versions... I think in jdk7 we will not have this issue as -client
sorta goes away in favor of the tiered thing? anyway, I think we should
proceed with caution here as far as moving things over to copyOf, I don't
see any evidence that its ever faster, but its definitely sometimes slower.
On Jun 30, 2011 9:12 AM, "Dawid Weiss" <da...@cs.put.poznan.pl> wrote:
> Arrays.copyOf(primitive) is actually System.arraycopy by default. If
> intrinsics are used it can only get faster. For object types it will
> probably be a bit slower for -client because of a runtime check for
> the component type.
>
> Dawid
>
> On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir <rc...@gmail.com> wrote:
>> because on windows 32bit at least, -client is still the default on
>> most jres out there.
>>
>> i realize people don't care about -client, but i will police
>> foo[].clone() / arrays.copyOf etc to prevent problems.
>>
>> There are comments about this stuff on the relevant bug reports
>> (oracle's site is down, sorry) linked to this issue.
>> https://issues.apache.org/jira/browse/LUCENE-2674
>>
>> Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
>> think we should always use arraycopy.
>>
>> On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
>> <si...@googlemail.com> wrote:
>>> hmm are you concerned about the extra Math.min that happens in the
>>> copyOf method?
>>> I don't how that relates to "intrinsic" and java 1.7
>>>
>>> I don't have strong feelings here just checking if you mix something
>>> up in the comment you put there... I am happy to keep the old and now
>>> current code
>>>
>>> simon
>>>
>>> On Thu, Jun 30, 2011 at 2:42 PM,  <rm...@apache.org> wrote:
>>>> Author: rmuir
>>>> Date: Thu Jun 30 12:42:17 2011
>>>> New Revision: 1141510
>>>>
>>>> URL: http://svn.apache.org/viewvc?rev=1141510&view=rev
>>>> Log:
>>>> LUCENE-3239: remove use of slow Arrays.copyOf
>>>>
>>>> Modified:
>>>>
 lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>>>>
>>>> Modified:
lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>>>> URL:
http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r1=1141509&r2=1141510&view=diff
>>>>
==============================================================================
>>>> ---
lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
(original)
>>>> +++
lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Thu Jun 30 12:42:17 2011
>>>> @@ -2,7 +2,6 @@ package org.apache.lucene.util;
>>>>
>>>>  import java.io.IOException;
>>>>  import java.io.OutputStream;
>>>> -import java.util.Arrays;
>>>>
>>>>  /**
>>>>  * Licensed to the Apache Software Foundation (ASF) under one or more
>>>> @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
>>>>   }
>>>>
>>>>   private void grow(int newLength) {
>>>> -    buffer = Arrays.copyOf(buffer, newLength);
>>>> +    // It actually should be: (Java 1.7, when its intrinsic on all
machines)
>>>> +    // buffer = Arrays.copyOf(buffer, newLength);
>>>> +    byte[] newBuffer = new byte[newLength];
>>>> +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
>>>> +    buffer = newBuffer;
>>>>   }
>>>>
>>>>   /**
>>>>
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>

Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Dawid Weiss <da...@cs.put.poznan.pl>.
Arrays.copyOf(primitive) is actually System.arraycopy by default. If
intrinsics are used it can only get faster. For object types it will
probably be a bit slower for -client because of a runtime check for
the component type.

Dawid

On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir <rc...@gmail.com> wrote:
> because on windows 32bit at least, -client is still the default on
> most jres out there.
>
> i realize people don't care about -client, but i will police
> foo[].clone() / arrays.copyOf etc to prevent problems.
>
> There are comments about this stuff on the relevant bug reports
> (oracle's site is down, sorry) linked to this issue.
> https://issues.apache.org/jira/browse/LUCENE-2674
>
> Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
> think we should always use arraycopy.
>
> On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
> <si...@googlemail.com> wrote:
>> hmm are you concerned about the extra Math.min that happens in the
>> copyOf method?
>> I don't how that relates to "intrinsic" and java 1.7
>>
>> I don't have strong feelings here just checking if you mix something
>> up in the comment you put there... I am happy to keep the old and now
>> current code
>>
>> simon
>>
>> On Thu, Jun 30, 2011 at 2:42 PM,  <rm...@apache.org> wrote:
>>> Author: rmuir
>>> Date: Thu Jun 30 12:42:17 2011
>>> New Revision: 1141510
>>>
>>> URL: http://svn.apache.org/viewvc?rev=1141510&view=rev
>>> Log:
>>> LUCENE-3239: remove use of slow Arrays.copyOf
>>>
>>> Modified:
>>>    lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>>>
>>> Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>>> URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r1=1141509&r2=1141510&view=diff
>>> ==============================================================================
>>> --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original)
>>> +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
>>> @@ -2,7 +2,6 @@ package org.apache.lucene.util;
>>>
>>>  import java.io.IOException;
>>>  import java.io.OutputStream;
>>> -import java.util.Arrays;
>>>
>>>  /**
>>>  * Licensed to the Apache Software Foundation (ASF) under one or more
>>> @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
>>>   }
>>>
>>>   private void grow(int newLength) {
>>> -    buffer = Arrays.copyOf(buffer, newLength);
>>> +    // It actually should be: (Java 1.7, when its intrinsic on all machines)
>>> +    // buffer = Arrays.copyOf(buffer, newLength);
>>> +    byte[] newBuffer = new byte[newLength];
>>> +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
>>> +    buffer = newBuffer;
>>>   }
>>>
>>>   /**
>>>
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

Posted by Robert Muir <rc...@gmail.com>.
because on windows 32bit at least, -client is still the default on
most jres out there.

i realize people don't care about -client, but i will police
foo[].clone() / arrays.copyOf etc to prevent problems.

There are comments about this stuff on the relevant bug reports
(oracle's site is down, sorry) linked to this issue.
https://issues.apache.org/jira/browse/LUCENE-2674

Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I
think we should always use arraycopy.

On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer
<si...@googlemail.com> wrote:
> hmm are you concerned about the extra Math.min that happens in the
> copyOf method?
> I don't how that relates to "intrinsic" and java 1.7
>
> I don't have strong feelings here just checking if you mix something
> up in the comment you put there... I am happy to keep the old and now
> current code
>
> simon
>
> On Thu, Jun 30, 2011 at 2:42 PM,  <rm...@apache.org> wrote:
>> Author: rmuir
>> Date: Thu Jun 30 12:42:17 2011
>> New Revision: 1141510
>>
>> URL: http://svn.apache.org/viewvc?rev=1141510&view=rev
>> Log:
>> LUCENE-3239: remove use of slow Arrays.copyOf
>>
>> Modified:
>>    lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>>
>> Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
>> URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r1=1141509&r2=1141510&view=diff
>> ==============================================================================
>> --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original)
>> +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
>> @@ -2,7 +2,6 @@ package org.apache.lucene.util;
>>
>>  import java.io.IOException;
>>  import java.io.OutputStream;
>> -import java.util.Arrays;
>>
>>  /**
>>  * Licensed to the Apache Software Foundation (ASF) under one or more
>> @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
>>   }
>>
>>   private void grow(int newLength) {
>> -    buffer = Arrays.copyOf(buffer, newLength);
>> +    // It actually should be: (Java 1.7, when its intrinsic on all machines)
>> +    // buffer = Arrays.copyOf(buffer, newLength);
>> +    byte[] newBuffer = new byte[newLength];
>> +    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
>> +    buffer = newBuffer;
>>   }
>>
>>   /**
>>
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org