Posted to dev@harmony.apache.org by Oliver Deakin <ol...@googlemail.com> on 2009/05/01 10:33:15 UTC

Re: svn commit: r770302 - /harmony/enhanced/classlib/trunk/modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/FileInputStreamTest.java

In the Setup() method we write to the stream using

fos.write(fileString.getBytes());

but getBytes() converts the string into the native encoding for the 
platform, so when we read the data back in we want to convert it back 
from the native encoding before we try to do any comparison with UTF-8 
chars.
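
To illustrate the difference between reading a raw byte and reading a
decoded character, here is a minimal sketch. It is not taken from the
Harmony test: the class name ReadBackSketch, the sample string, and the
use of an in-memory ByteArrayInputStream in place of the test file are
assumptions made for illustration. Explicit charsets are used only so
that the outcome does not depend on the platform default.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the Harmony test suite.
final class ReadBackSketch {

    // Reads one raw byte from the stream, with no charset decoding.
    static int firstRawByte(byte[] data) {
        try (InputStream in = new ByteArrayInputStream(data)) {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads one char, decoding the bytes with the given charset.
    static int firstDecodedChar(byte[] data, Charset cs) {
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data), cs)) {
            return r.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String s = "\u00e9t\u00e9"; // starts with a non-ASCII character
        byte[] data = s.getBytes(StandardCharsets.UTF_8);

        // The raw byte is 0xC3, the first byte of the two-byte encoding.
        System.out.println(firstRawByte(data) == s.charAt(0)); // false

        // Decoding with the same charset used for writing recovers 0xE9.
        System.out.println(
            firstDecodedChar(data, StandardCharsets.UTF_8) == s.charAt(0)); // true
    }
}
```

For a string that starts with an ASCII character, both reads give the
same value in any ASCII-compatible encoding, so the difference only
shows up with other characters or other encodings.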

Regards,
Oliver

Nathan Beyer wrote:
> I'm curious about this change. There's no declaration of UTF-8 as the
> encoding, how is that getting set? AIUI the InputStreamReader will use
> the default encoding of the operating system.
>
> -Nathan
>
> On Thu, Apr 30, 2009 at 10:54 AM,  <od...@apache.org> wrote:
>   
>> Author: odeakin
>> Date: Thu Apr 30 15:54:29 2009
>> New Revision: 770302
>>
>> URL: http://svn.apache.org/viewvc?rev=770302&view=rev
>> Log:
>> Minor change to ensure we read test data back in UTF-8.
>>
>> Modified:
>>    harmony/enhanced/classlib/trunk/modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/FileInputStreamTest.java
>>
>> Modified: harmony/enhanced/classlib/trunk/modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/FileInputStreamTest.java
>> URL: http://svn.apache.org/viewvc/harmony/enhanced/classlib/trunk/modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/FileInputStreamTest.java?rev=770302&r1=770301&r2=770302&view=diff
>> ==============================================================================
>> --- harmony/enhanced/classlib/trunk/modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/FileInputStreamTest.java (original)
>> +++ harmony/enhanced/classlib/trunk/modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/FileInputStreamTest.java Thu Apr 30 15:54:29 2009
>> @@ -19,6 +19,7 @@
>>
>>  import java.io.File;
>>  import java.io.FileInputStream;
>> +import java.io.InputStreamReader;
>>  import java.io.FileOutputStream;
>>  import java.io.FilePermission;
>>  import java.io.IOException;
>> @@ -109,9 +110,9 @@
>>      * @tests java.io.FileInputStream#read()
>>      */
>>     public void test_read() throws IOException {
>> -        is = new FileInputStream(fileName);
>> -        int c = is.read();
>> -        is.close();
>> +        InputStreamReader isr = new InputStreamReader(new FileInputStream(fileName));
>> +        int c = isr.read();
>> +        isr.close();
>>         assertTrue("read returned incorrect char", c == fileString.charAt(0));
>>     }
>>
>>
>>
>>
>>     
>
>   

-- 
Oliver Deakin
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU


Re: svn commit: r770302 - /harmony/enhanced/classlib/trunk/modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/FileInputStreamTest.java

Posted by Oliver Deakin <ol...@googlemail.com>.
Nathan Beyer wrote:
> On Sun, May 3, 2009 at 10:29 AM, Oliver Deakin
> <ol...@googlemail.com> wrote:
>   
>> Ah right - yes maybe the comment should have read "_into_ UTF-8". I had the
>> assertTrue() just below my change in mind when I wrote it:
>>
>>  assertTrue("read returned incorrect char", c == fileString.charAt(0));
>>
>> I believe the fileString.charAt(0) will return a UTF-8 encoded character
>> here, and the InputStreamReader is also doing a conversion from the platform
>> default encoding into UTF-8, then the assert is carrying out a straight
>>     
>
> A Java char isn't a UTF-8 value, it's a UTF-16 value, at least as of
> Java 5 [1]. Prior to Java 5, a char was a Unicode code point.
>   

Ah ok - I had it in my mind that Java used UTF-8 for Java strings/chars.



> [1] http://java.sun.com/developer/technicalArticles/Intl/Supplementary/
>
>
>
>   
>> value comparison between the two characters. It was the conversion from the
>> platform default encoding that I was referring to. Hope that's made it
>> clearer.
>>
>> Regards,
>> Oliver



Re: svn commit: r770302 - /harmony/enhanced/classlib/trunk/modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/FileInputStreamTest.java

Posted by Nathan Beyer <nd...@apache.org>.
On Sun, May 3, 2009 at 10:29 AM, Oliver Deakin
<ol...@googlemail.com> wrote:
> Ah right - yes maybe the comment should have read "_into_ UTF-8". I had the
> assertTrue() just below my change in mind when I wrote it:
>
>  assertTrue("read returned incorrect char", c == fileString.charAt(0));
>
> I believe the fileString.charAt(0) will return a UTF-8 encoded character
> here, and the InputStreamReader is also doing a conversion from the platform
> default encoding into UTF-8, then the assert is carrying out a straight

A Java char isn't a UTF-8 value, it's a UTF-16 value, at least as of
Java 5 [1]. Prior to Java 5, a char was a Unicode code point.

[1] http://java.sun.com/developer/technicalArticles/Intl/Supplementary/
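
A small sketch of what "a char is a UTF-16 value" means in practice.
The class name Utf16CharSketch and the sample characters are
assumptions chosen for illustration; only standard java.lang.String and
java.lang.Character methods are used.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Harmony.
final class Utf16CharSketch {

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // U+00E9 fits in a single UTF-16 code unit but needs two UTF-8 bytes.
        String eAcute = "\u00e9";
        System.out.println(eAcute.length());        // 1
        System.out.println(utf8Length(eAcute));     // 2
        System.out.println((int) eAcute.charAt(0)); // 233

        // U+1D11E lies outside the Basic Multilingual Plane, so it is
        // stored as a surrogate pair: two chars for one code point.
        String clef = "\uD834\uDD1E";
        System.out.println(clef.length());                              // 2
        System.out.println(clef.codePointCount(0, clef.length()));      // 1
        System.out.println(Character.isHighSurrogate(clef.charAt(0))); // true
        System.out.println(clef.codePointAt(0) == 0x1D11E);             // true
        System.out.println(utf8Length(clef));                           // 4
    }
}
```

So charAt(0) returns a 16-bit code unit, and any UTF-8 (or other)
encoding only comes into play when the string is converted to bytes.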



> value comparison between the two characters. It was the conversion from the
> platform default encoding that I was referring to. Hope that's made it
> clearer.
>
> Regards,
> Oliver

Re: svn commit: r770302 - /harmony/enhanced/classlib/trunk/modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/FileInputStreamTest.java

Posted by Oliver Deakin <ol...@googlemail.com>.
Ah right - yes maybe the comment should have read "_into_ UTF-8". I had 
the assertTrue() just below my change in mind when I wrote it:

  assertTrue("read returned incorrect char", c == fileString.charAt(0));

I believe the fileString.charAt(0) will return a UTF-8 encoded character 
here, and the InputStreamReader is also doing a conversion from the 
platform default encoding into UTF-8, then the assert is carrying out a 
straight value comparison between the two characters. It was the 
conversion from the platform default encoding that I was referring to. 
Hope that's made it clearer.

Regards,
Oliver


Nathan Beyer wrote:
> I'm still missing it - what does this have to do with UTF-8?
>
> fileString.getBytes() will return a byte[] encoded using the platform default
>
> new InputStreamReader(new FileInputStream(fileName)) will open a file
> and read the bytes using the platform default
>
> I see that there's now symmetry, but it's not about UTF-8. On Windows,
> all of this is happening with Windows-1252.
>
> I'm just confused about the comment - I don't disagree with the code change.
>
> -Nathan



Re: svn commit: r770302 - /harmony/enhanced/classlib/trunk/modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/FileInputStreamTest.java

Posted by Nathan Beyer <nd...@apache.org>.
I'm still missing it - what does this have to do with UTF-8?

fileString.getBytes() will return a byte[] encoded using the platform default

new InputStreamReader(new FileInputStream(fileName)) will open a file
and read the bytes using the platform default

I see that there's now symmetry, but it's not about UTF-8. On Windows,
all of this is happening with Windows-1252.
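
The symmetry described above can be sketched as follows. This is only
an illustration: the class name CharsetSymmetrySketch and the roundTrip
helper are hypothetical, an in-memory stream stands in for the test
file, and ISO-8859-1 stands in for a single-byte platform default such
as Windows-1252 because it is one of the charsets every Java platform
is required to support.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the Harmony test suite.
final class CharsetSymmetrySketch {

    // Encodes text with writeCs, then decodes the bytes with readCs.
    static String roundTrip(String text, Charset writeCs, Charset readCs) {
        byte[] data = text.getBytes(writeCs);
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data), readCs)) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";
        Charset latin1 = StandardCharsets.ISO_8859_1;
        Charset utf8 = StandardCharsets.UTF_8;

        // Same charset on both sides: the text survives, whichever charset it is.
        System.out.println(roundTrip(text, latin1, latin1).equals(text)); // true
        System.out.println(roundTrip(text, utf8, utf8).equals(text));     // true

        // Mismatched charsets: the two UTF-8 bytes decode as two chars.
        System.out.println(roundTrip(text, utf8, latin1).equals(text));   // false
    }
}
```

Passing the same explicit Charset to both getBytes() and the
InputStreamReader constructor is one way to make such a test
independent of the platform default, if that were ever wanted.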

I'm just confused about the comment - I don't disagree with the code change.

-Nathan

On Fri, May 1, 2009 at 3:33 AM, Oliver Deakin
<ol...@googlemail.com> wrote:
> In the Setup() method we write to the stream using
>
> fos.write(fileString.getBytes());
>
> but getBytes() converts the string into the native encoding for the
> platform, so when we read the data back in we want to convert it back from
> the native encoding before we try to do any comparison with UTF-8 chars.
>
> Regards,
> Oliver