You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@harmony.apache.org by Tim Ellison <t....@gmail.com> on 2006/06/26 10:54:58 UTC

syncronization in Charset (was: Re: [continuum] BUILD FAILURE: Classlib/linux.ia32 Build/Test)

Paulex Yang wrote:
> Tim,
> 
> Good catch! But I'm afraid it not only a test defect, but also a
> possible Harmony implementation bug.
> 
> The JavaDoc for String says nothing about synchronization, but Charset's
> does: " All of the methods defined in this class are safe for use by
> multiple concurrent threads.". So it should be safe to invoke
> Charset.encode() concurrently.

Yep, so if that had been used then I would have complained differently,
but since String doesn't claim to be thread-safe I moved it out.

> Looking inside the Charset.encode(CharBuffer), it has got lock on
> itself, the cause is the system wide cached CharsetEncoder instance is
> not synchronized[1], so if String.getBytes() always use
> Charset.forName() to get the cached charset instance for some certain
> encoding, the error won't happen. Probably String create a new Charset
> every time.

Agreed, if the String was not using the 'lastCharset' it will create
multiple instances.  Not sure why it would have this churn, unless there
was some IO on another thread or something with a different charset.

> Even so, I think the Charset implementation should be safe
> according to spec, so I suggest to also hold lock on the cached
> CharsetEncoder during Charset.encode() so that this method is safe
> enough to be used concurrently.
> 
> comments?

Yes, all instances of Charset should safely use the cached
encoders/decoders.  You may then also be able to reduce the
synchronization of the Charset itself (it would be redundant, but would
not matter if two instances cache the same encoder simultaneously).

Regards,
Tim


> [1] code fragment from Charset.encode()
> synchronized public final ByteBuffer encode(CharBuffer buffer) {
>       CharsetEncoder e = getCachedCharsetEncoder(canonicalName);
>      ...
>       try {
>             // e is not safe for concurrent use
>            return e.encode(buffer);
>       } catch (CharacterCodingException ex) {
>            throw new Error(ex.getMessage(), ex);
>       }
> }
> 
> Tim Ellison wrote:
>> The failing test does this (amongst other things):
>>
>>         final int THREAD_NUM = 20;
>>         Thread[] thread = new Thread[THREAD_NUM];
>>         for (int i = 0; i < THREAD_NUM; i++) {
>>             thread[i] = new Thread() {
>>                 public void run() {
>>                     try {
>>                         sink
>>                                 .write(ByteBuffer.wrap("bytes"
>> *****                                   .getBytes(ISO8859_1)));
>>                     } catch (IOException e) {
>>                         throw new RuntimeException(e);
>>                     }
>>                 }
>>             };
>>         }
>>         for (int i = 0; i < THREAD_NUM; i++) {
>>             thread[i].start();
>>         }
>>
>> If String#getBytes(String) is not thread safe then we are doomed ;-)
>>
>> How about:
>>
>>
>> @@ -104,14 +104,13 @@
>>      public void test_write_LByteBuffer_mutliThread() throws IOException,
>>              InterruptedException {
>>          final int THREAD_NUM = 20;
>> +        final byte[] strbytes = "bytes".getBytes(ISO8859_1);
>>          Thread[] thread = new Thread[THREAD_NUM];
>>          for (int i = 0; i < THREAD_NUM; i++) {
>>              thread[i] = new Thread() {
>>                  public void run() {
>>                      try {
>> -                        sink
>> -                                .write(ByteBuffer.wrap("bytes"
>> -                                        .getBytes(ISO8859_1)));
>> +                        sink.write(ByteBuffer.wrap(strbytes));
>>                      } catch (IOException e) {
>>                          throw new RuntimeException(e);
>>                      }
>>
>>
>> Regards,
>> Tim
>>
>>
>> Apache Harmony Build wrote:
>> <snip>
>>  
>>>      [exec]     [junit] java.lang.IllegalStateException
>>>      [exec]     [junit]     at
>>> java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:480)
>>>      [exec]     [junit]     at
>>> java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:347)
>>>      [exec]     [junit]     at
>>> java.nio.charset.Charset.encode(Charset.java:639)
>>>      [exec]     [junit]     at
>>> java.lang.String.getBytes(String.java:838)
>>>      [exec]     [junit]     at
>>> org.apache.harmony.tests.java.nio.channels.SinkChannelTest$1.run(SinkChannelTest.java:112)
>>>
>>>      [exec]     [junit] Tests FAILED
>>>     
>>
>>
>>   
> 
> 

-- 

Tim Ellison (t.p.ellison@gmail.com)
IBM Java technology centre, UK.

---------------------------------------------------------------------
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: harmony-dev-unsubscribe@incubator.apache.org
For additional commands, e-mail: harmony-dev-help@incubator.apache.org


Re: syncronization in Charset

Posted by Paulex Yang <pa...@gmail.com>.
Tim Ellison wrote:
> Paulex Yang wrote:
>   
>> Tim,
>>
>> Good catch! But I'm afraid it not only a test defect, but also a
>> possible Harmony implementation bug.
>>
>> The JavaDoc for String says nothing about synchronization, but Charset's
>> does: " All of the methods defined in this class are safe for use by
>> multiple concurrent threads.". So it should be safe to invoke
>> Charset.encode() concurrently.
>>     
>
> Yep, so if that had been used then I would have complained differently,
> but since String doesn't claim to be thread-safe I moved it out.
>
>   
>> Looking inside the Charset.encode(CharBuffer), it has got lock on
>> itself, the cause is the system wide cached CharsetEncoder instance is
>> not synchronized[1], so if String.getBytes() always use
>> Charset.forName() to get the cached charset instance for some certain
>> encoding, the error won't happen. Probably String create a new Charset
>> every time.
>>     
>
> Agreed, if the String was not using the 'lastCharset' it will create
> multiple instances.  Not sure why it would have this churn, unless there
> was some IO on another thread or something with a different charset.
>
>   
>> Even so, I think the Charset implementation should be safe
>> according to spec, so I suggest to also hold lock on the cached
>> CharsetEncoder during Charset.encode() so that this method is safe
>> enough to be used concurrently.
>>
>> comments?
>>     
>
> Yes, all instances of Charset should safely use the cached
> encoders/decoders.  You may then also be able to reduce the
> synchronization of the Charset itself (it would be redundant, but would
> not matter if two instances cache the same encoder simultaneously).
>   
+1, the Charset.decode() has same issue, I'll raise a JIRA and create 
patch for this.
> Regards,
> Tim
>
>
>   
>> [1] code fragment from Charset.encode()
>> synchronized public final ByteBuffer encode(CharBuffer buffer) {
>>       CharsetEncoder e = getCachedCharsetEncoder(canonicalName);
>>      ...
>>       try {
>>             // e is not safe for concurrent use
>>            return e.encode(buffer);
>>       } catch (CharacterCodingException ex) {
>>            throw new Error(ex.getMessage(), ex);
>>       }
>> }
>>
>> Tim Ellison wrote:
>>     
>>> The failing test does this (amongst other things):
>>>
>>>         final int THREAD_NUM = 20;
>>>         Thread[] thread = new Thread[THREAD_NUM];
>>>         for (int i = 0; i < THREAD_NUM; i++) {
>>>             thread[i] = new Thread() {
>>>                 public void run() {
>>>                     try {
>>>                         sink
>>>                                 .write(ByteBuffer.wrap("bytes"
>>> *****                                   .getBytes(ISO8859_1)));
>>>                     } catch (IOException e) {
>>>                         throw new RuntimeException(e);
>>>                     }
>>>                 }
>>>             };
>>>         }
>>>         for (int i = 0; i < THREAD_NUM; i++) {
>>>             thread[i].start();
>>>         }
>>>
>>> If String#getBytes(String) is not thread safe then we are doomed ;-)
>>>
>>> How about:
>>>
>>>
>>> @@ -104,14 +104,13 @@
>>>      public void test_write_LByteBuffer_mutliThread() throws IOException,
>>>              InterruptedException {
>>>          final int THREAD_NUM = 20;
>>> +        final byte[] strbytes = "bytes".getBytes(ISO8859_1);
>>>          Thread[] thread = new Thread[THREAD_NUM];
>>>          for (int i = 0; i < THREAD_NUM; i++) {
>>>              thread[i] = new Thread() {
>>>                  public void run() {
>>>                      try {
>>> -                        sink
>>> -                                .write(ByteBuffer.wrap("bytes"
>>> -                                        .getBytes(ISO8859_1)));
>>> +                        sink.write(ByteBuffer.wrap(strbytes));
>>>                      } catch (IOException e) {
>>>                          throw new RuntimeException(e);
>>>                      }
>>>
>>>
>>> Regards,
>>> Tim
>>>
>>>
>>> Apache Harmony Build wrote:
>>> <snip>
>>>  
>>>       
>>>>      [exec]     [junit] java.lang.IllegalStateException
>>>>      [exec]     [junit]     at
>>>> java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:480)
>>>>      [exec]     [junit]     at
>>>> java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:347)
>>>>      [exec]     [junit]     at
>>>> java.nio.charset.Charset.encode(Charset.java:639)
>>>>      [exec]     [junit]     at
>>>> java.lang.String.getBytes(String.java:838)
>>>>      [exec]     [junit]     at
>>>> org.apache.harmony.tests.java.nio.channels.SinkChannelTest$1.run(SinkChannelTest.java:112)
>>>>
>>>>      [exec]     [junit] Tests FAILED
>>>>     
>>>>         
>>>   
>>>       
>>     
>
>   


-- 
Paulex Yang
China Software Development Lab
IBM



---------------------------------------------------------------------
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: harmony-dev-unsubscribe@incubator.apache.org
For additional commands, e-mail: harmony-dev-help@incubator.apache.org