Posted to dev@harmony.apache.org by Mark Hindess <ma...@googlemail.com> on 2007/10/17 20:34:15 UTC

Re: [classlib][icu] Bringing ICU level up to 3.8

On 17 October 2007 at 16:32, Tim Ellison <t....@gmail.com> wrote:
> Tim Ellison wrote:
> > (Left as an exercise for the reader <g>)
> 
> Feeling a bit guilty about the cop-out...

And so you should ;-)

> I slightly modified Alexei's test case to
>  - include some warm-up encode/decode loops to get the methods
>    jitted to a reasonable level,
>  - read the data into a direct byte buffer, and then into a
>    'regular' byte buffer, i.e. one allocated in the Java heap,
>  - I then looked at the effect of converting a short string (129 chars)

I'd have gone with the following 110 character string myself:

  "I took a speed-reading course and read War and Peace in twenty
  minutes.  It involves Russia." - Woody Allen.
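
More seriously, for anyone who wants to reproduce the numbers below, the
test Tim describes amounts to something like the sketch that follows.
To be clear, this is my own reconstruction rather than Alexei's actual
code: the class name, file handling and loop counts are guesses (Tim's
runs used 10 warm-up/10 timed iterations on the long string and
1000/10000 on the short one), and the input file is assumed to be valid
CP1251 text.

  import java.io.FileInputStream;
  import java.nio.ByteBuffer;
  import java.nio.CharBuffer;
  import java.nio.channels.FileChannel;
  import java.nio.charset.Charset;
  import java.nio.charset.CharsetDecoder;
  import java.nio.charset.CharsetEncoder;

  public class CharsetBench {

      // Tim used 10/10 loops for the long string and 1000/10000 for the
      // short one; adjust to taste.
      static final int WARMUP = 10;
      static final int TIMED = 10;

      public static void main(String[] args) throws Exception {
          Charset cs = Charset.forName("windows-1251");

          // Read the raw test data into a 'regular' (Java heap) buffer...
          FileChannel ch = new FileInputStream(args[0]).getChannel();
          ByteBuffer heap = ByteBuffer.allocate((int) ch.size());
          while (heap.hasRemaining() && ch.read(heap) != -1) { }
          ch.close();
          heap.flip();

          // ...and copy the same bytes into a direct buffer.
          ByteBuffer direct = ByteBuffer.allocateDirect(heap.remaining());
          direct.put(heap.duplicate()).flip();

          // Characters to feed the encoder side of the test.
          CharBuffer chars = cs.decode(heap.duplicate());
          System.out.println("Read chars = " + chars.remaining());

          bench(cs, direct, chars, "Direct ByteBuffer");
          bench(cs, heap, chars, "Java Heap Byte Buffer");
      }

      static void bench(Charset cs, ByteBuffer in, CharBuffer chars,
              String label) throws Exception {
          CharsetDecoder dec = cs.newDecoder();
          CharsetEncoder enc = cs.newEncoder();

          // Warm up so the timed loops measure jitted code, not the
          // interpreter.
          for (int i = 0; i < WARMUP; i++) {
              dec.decode(in.duplicate());
              enc.encode(chars.duplicate());
          }

          long start = System.currentTimeMillis();
          for (int i = 0; i < TIMED; i++) {
              dec.decode(in.duplicate());
          }
          System.out.println(label + " decoding time: "
                  + (System.currentTimeMillis() - start) + " millis");

          start = System.currentTimeMillis();
          for (int i = 0; i < TIMED; i++) {
              enc.encode(chars.duplicate());
          }
          System.out.println(label + " encoding time: "
                  + (System.currentTimeMillis() - start) + " millis");
      }
  }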

> I was running this on the IBM VME, and here's what I got (below).
>
> Interestingly the Java decoder was faster on the long string than the
> native code.  The others are sufficiently similar to imply to me that
> we should just keep it all in Java.

You mean remove the heuristic and remove the Intel-contributed native
code?  I guess that seems reasonable given these results; it would
enable us to reduce the size of the code base (and the JRE footprint,
as discussed elsewhere) and concentrate our efforts on the Java
implementation.
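
(For anyone without the source to hand: the heuristic in question is,
roughly, a per-call choice between the pure-Java converter and the
native one.  The sketch below is only meant to show the shape of what
would go away; the threshold, the direct-buffer check and all the names
are invented here, it is not the actual org.apache.harmony.niochar code.)

  import java.nio.ByteBuffer;
  import java.nio.CharBuffer;
  import java.nio.charset.CoderResult;

  // Purely illustrative: a decoder that picks between a native and a
  // Java path per call.  Nothing here is the real Harmony source.
  class DualPathDecoderShape {
      private static final int NATIVE_THRESHOLD = 64;   // made-up number

      CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
          if (in.isDirect() && in.remaining() > NATIVE_THRESHOLD) {
              return decodeNative(in, out);   // the JNI path under discussion
          }
          return decodeJava(in, out);         // the pure-Java path we would keep
      }

      private CoderResult decodeNative(ByteBuffer in, CharBuffer out) {
          // would call down through JNI into the native converter
          return CoderResult.UNDERFLOW;
      }

      private CoderResult decodeJava(ByteBuffer in, CharBuffer out) {
          // table-driven byte-to-char mapping, all in Java
          return CoderResult.UNDERFLOW;
      }
  }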

Of course, this is rather dependent on us being able to achieve similar
results on DRLVM - so it would be interesting to see these results for
that VM too.

Regards,
 Mark.

> === long string
> 
> Read chars = 3285165
> 10 loops warm-up
> 10 loops timed
> 
> Direct ByteBuffer
> 
> Built-in
> <org.apache.harmony.niochar.charset.CP_1251$Decoder> Decoding time: 781 millis
> <org.apache.harmony.niochar.charset.CP_1251$Encoder> Encoding time: 571 millis
> 
> 
> Java Heap Byte Buffer
> 
> Built-in
> <org.apache.harmony.niochar.charset.CP_1251$Decoder> Decoding time: 430 millis
> <org.apache.harmony.niochar.charset.CP_1251$Encoder> Encoding time: 521 millis
> 
> 
> === short string
> 
> Read chars = 129
> 1000 loops warm-up
> 10000 loops timed
> 
> 
> Direct ByteBuffer
> 
> Built-in
> <org.apache.harmony.niochar.charset.CP_1251$Decoder> Decoding time: 10 millis
> <org.apache.harmony.niochar.charset.CP_1251$Encoder> Encoding time: 0 millis
> 
> 
> Java Heap Byte Buffer
> 
> Built-in
> <org.apache.harmony.niochar.charset.CP_1251$Decoder> Decoding time: 10 millis
> <org.apache.harmony.niochar.charset.CP_1251$Encoder> Encoding time: 0 millis
> 



Re: [classlib][icu] Bringing ICU level up to 3.8

Posted by Tim Ellison <t....@gmail.com>.
Mark Hindess wrote:
> On 17 October 2007 at 16:32, Tim Ellison <t....@gmail.com> wrote:
>> Tim Ellison wrote:
>> I was running this on the IBM VME, and here's what I got (below).
>>
>> Interestingly the Java decoder was faster on the long string than the
>> native code.  The others are sufficiently similar to imply to me that
>> we should just keep it all in Java.
> 
> You mean remove the heuristic and remove the Intel-contributed native
> code?  I guess that seems reasonable given these results; it would
> enable us to reduce the size of the code base (and the JRE footprint,
> as discussed elsewhere) and concentrate our efforts on the Java
> implementation.

Well, I'm not quite there yet.  I was running on the IBM VME and only did
a modicum of testing on my uniprocessor laptop, so I'd want to see a more
compelling case before discarding any existing code.

> Of course, this is rather dependent on us being able to achieve similar
> results on DRLVM - so it would be interesting to see these results for
> that VM too.

Agreed, and on different OS / CPU combinations.

It may be that I am measuring the capabilities of the JIT, which
certainly makes a big difference:

(WinXP, Centrino, large string, w/ warm up cycles, IBM VME)

JIT off, non-direct buffer:

  Decoding time: 2193 millis
  Encoding time: 2634 millis

JIT off, direct buffer:

  Decoding time: 771 millis
  Encoding time: 2624 millis    <-- looks strange


JIT on, direct buffer:
  Decoding time: 751 millis
  Encoding time: 461 millis

JIT on, non-direct buffer:
  Decoding time: 420 millis
  Encoding time: 481 millis
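
For anyone wanting to repeat the JIT on/off comparison: disabling the
JIT is just a VM switch.  The exact option depends on the VM; with one
that accepts the usual -Xint switch, and a harness like the one Mark
sketched, the two runs are simply (the file name is a placeholder):

  java -Xint CharsetBench data-cp1251.txt    # interpreted ("JIT off")
  java CharsetBench data-cp1251.txt          # jitted ("JIT on")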


Regards,
Tim