Posted to log4j-dev@logging.apache.org by Mikael Ståldal <mi...@magine.com> on 2016/05/18 15:29:16 UTC
Garbage-free string encoding performance with UTF-16 charset
It seems like the new garbage-free string encoding method performs poorly
with the UTF-16 charset.
See AbstractStringLayoutStringEncodingBenchmark in log4j-perf, which I just
committed to the master branch.
My results (note utf16Encode):
Benchmark           Mode    Samples     Score     Error  Units
baseline            sample    90395    24.754  ±  0.484  ns/op
iso8859_1Encode     sample    54514   130.176  ±  2.320  ns/op
iso8859_1GetBytes   sample    64464   126.122  ±  1.184  ns/op
usAsciiEncode       sample    68833   190.550  ±  1.117  ns/op
usAsciiGetBytes     sample    80176   170.556  ±  1.691  ns/op
utf16Encode         sample    86597  2013.954  ± 10.551  ns/op
utf16GetBytes       sample    63696   386.276  ± 46.024  ns/op
utf8Encode          sample    69108   190.773  ±  1.504  ns/op
utf8GetBytes        sample    66561   196.247  ±  1.623  ns/op
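
For readers who have not looked at the benchmark itself, the two styles being compared can be sketched roughly as below. This is only an illustration under stated assumptions, not the Log4j code: the class name EncodingSketch and the methods viaGetBytes and viaEncoder are hypothetical, and the real layout code may manage its buffers differently. The sketch uses only standard java.nio.charset calls. One path allocates a fresh byte[] per call, the other reuses a caller-supplied CharsetEncoder and buffers.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; names are not taken from Log4j.
public class EncodingSketch {

    /** Allocating style: each call returns a newly created byte[]. */
    public static byte[] viaGetBytes(String text, Charset charset) {
        return text.getBytes(charset);
    }

    /**
     * Reusing style: encoder and buffers are owned by the caller and reused
     * across calls. Returns the number of bytes written into {@code out}.
     * Note: a default CharsetEncoder reports malformed input (for example an
     * unpaired surrogate) as an error, whereas String.getBytes substitutes a
     * replacement, so the two paths differ on such input.
     */
    public static int viaEncoder(CharSequence text, CharsetEncoder encoder,
                                 CharBuffer chars, ByteBuffer out) {
        chars.clear();
        chars.append(text);   // throws BufferOverflowException if text is too long
        chars.flip();
        out.clear();
        encoder.reset();      // also re-arms the byte-order mark for UTF-16
        CoderResult result = encoder.encode(chars, out, true);
        if (!result.isUnderflow()) {
            throw new IllegalStateException("encode failed: " + result);
        }
        result = encoder.flush(out);
        if (!result.isUnderflow()) {
            throw new IllegalStateException("flush failed: " + result);
        }
        return out.position();
    }

    public static void main(String[] args) {
        Charset charset = StandardCharsets.UTF_16;
        CharsetEncoder encoder = charset.newEncoder();
        CharBuffer chars = CharBuffer.allocate(128);
        ByteBuffer out = ByteBuffer.allocate(512);

        String message = "Hello, UTF-16";
        int written = viaEncoder(message, encoder, chars, out);
        byte[] reference = viaGetBytes(message, charset);

        // Both paths should produce the same bytes for well-formed input.
        boolean same = Arrays.equals(reference, Arrays.copyOf(out.array(), written));
        System.out.println("bytes written: " + written + ", identical: " + same);
    }
}
```

The sketch says nothing about why the UTF-16 case is slow; it only shows what kind of code paths a benchmark like this would exercise.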
--
[image: MagineTV]
*Mikael Ståldal*
Senior software developer
*Magine TV*
mikael.staldal@magine.com
Grev Turegatan 3 | 114 46 Stockholm, Sweden | www.magine.com
Privileged and/or Confidential Information may be contained in this
message. If you are not the addressee indicated in this message
(or responsible for delivery of the message to such a person), you may not
copy or deliver this message to anyone. In such case,
you should destroy this message and kindly notify the sender by reply
email.
Re: Garbage-free string encoding performance with UTF-16 charset
Posted by Gary Gregory <ga...@gmail.com>.
Ditto, I've only seen UTF-16 used for XML documents. All it takes is one
customer, though ;-) Still, I do not think we need to hold up a release for
this.
Gary
On Wed, May 18, 2016 at 9:10 AM, Matt Sicker <bo...@gmail.com> wrote:
> I used UTF-16 to encode an XML file by accident once. That's about the
> extent that I've ever used it.
--
E-Mail: garydgregory@gmail.com | ggregory@apache.org
Java Persistence with Hibernate, Second Edition
<http://www.manning.com/bauer3/>
JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
Spring Batch in Action <http://www.manning.com/templier/>
Blog: http://garygregory.wordpress.com
Home: http://garygregory.com/
Tweet! http://twitter.com/GaryGregory
Re: Garbage-free string encoding performance with UTF-16 charset
Posted by Matt Sicker <bo...@gmail.com>.
I used UTF-16 to encode an XML file by accident once. That's about the
extent that I've ever used it.
On 18 May 2016 at 11:08, Mikael Ståldal <mi...@magine.com> wrote:
> Maybe not, if we assume that most users won't use UTF-16.
>
> (I don't use UTF-16, and I don't know any specific use case for it. I just
> thought it would be good to test it.)
>
> There is no significant difference for US-ASCII, ISO-8859-1 and UTF-8.
--
Matt Sicker <bo...@gmail.com>
Re: Garbage-free string encoding performance with UTF-16 charset
Posted by Mikael Ståldal <mi...@magine.com>.
Maybe not, if we assume that most users won't use UTF-16.
(I don't use UTF-16, and I don't know any specific use case for it. I just
thought it would be good to test it.)
There is no significant difference for US-ASCII, ISO-8859-1 and UTF-8.
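
To put numbers on that comparison, the Encode/GetBytes ratio for each charset can be computed from the scores posted at the start of this thread. The snippet below is just that arithmetic as a small sketch. The class name EncodeOverhead and the method ratio are hypothetical, and the calculation ignores the reported error margins.

```java
import java.util.Locale;

// Hypothetical helper; the scores are copied from the benchmark table in this thread.
public class EncodeOverhead {

    /** Ratio of the Encode score to the GetBytes score (both in ns/op). */
    public static double ratio(double encodeNs, double getBytesNs) {
        return encodeNs / getBytesNs;
    }

    public static void main(String[] args) {
        String[] charsets = {"ISO-8859-1", "US-ASCII", "UTF-16", "UTF-8"};
        double[][] scores = {
            {130.176, 126.122},   // iso8859_1Encode, iso8859_1GetBytes
            {190.550, 170.556},   // usAsciiEncode,   usAsciiGetBytes
            {2013.954, 386.276},  // utf16Encode,     utf16GetBytes
            {190.773, 196.247},   // utf8Encode,      utf8GetBytes
        };
        for (int i = 0; i < charsets.length; i++) {
            System.out.printf(Locale.ROOT, "%-10s %.2fx%n",
                    charsets[i], ratio(scores[i][0], scores[i][1]));
        }
    }
}
```

This gives roughly 1.03x for ISO-8859-1, 1.12x for US-ASCII, 5.21x for UTF-16 and 0.97x for UTF-8, so UTF-16 is the only case where Encode is several times slower than GetBytes.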
On Wed, May 18, 2016 at 6:02 PM, Remko Popma <re...@gmail.com> wrote:
> Interesting. I'll take a look tomorrow.
> I don't think this is a showstopper though, would you agree?
Re: Garbage-free string encoding performance with UTF-16 charset
Posted by Remko Popma <re...@gmail.com>.
Interesting. I'll take a look tomorrow.
I don't think this is a showstopper though, would you agree?
On Thu, May 19, 2016 at 12:29 AM, Mikael Ståldal <mi...@magine.com>
wrote:
> It seems like the new garbage-free string encoding method performs poorly
> with the UTF-16 charset.