Posted to commits@harmony.apache.org by "Li Jing Qin (JIRA)" <ji...@apache.org> on 2009/03/05 07:41:59 UTC
[jira] Commented: (HARMONY-4196) [classlib][luni] InputStreamReader can't handle UnicodeBig encoding
[ https://issues.apache.org/jira/browse/HARMONY-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679092#action_12679092 ]
Li Jing Qin commented on HARMONY-4196:
--------------------------------------
Hey guys, I am running the EUT tests for 3.5, and this issue also blocks one of the test cases, so I have decided to fix it.
I agree with Paulex that UnicodeBig and UnicodeLittle should be mapped to UTF-16. Here is a similar test:
public final static byte[] BOM_UTF_16BE = {(byte) 0xFE, (byte) 0xFF};

public static void printByteArray(byte[] array) {
    System.out.println("LEN: " + array.length);
    for (byte b : array) {
        System.out.print(Character.forDigit(((b & 0xF0) >> 4), 16));
        System.out.print(Character.forDigit((b & 0x0F), 16));
        System.out.print(" ");
    }
    System.out.println();
}

public static InputStream getInputStream(byte[][] contents) {
    int size = 0;
    // computes final array size
    for (int i = 0; i < contents.length; i++)
        size += contents[i].length;
    byte[] full = new byte[size];
    int fullIndex = 0;
    // concatenates all byte arrays
    for (int i = 0; i < contents.length; i++)
        for (int j = 0; j < contents[i].length; j++)
            full[fullIndex++] = contents[i][j];
    return new ByteArrayInputStream(full);
}

public static void main(String[] args) throws Exception {
    String XML_ROOT_ELEMENT_NO_DECL = "<org.eclipse.core.runtime.tests.root-element/>";
    try {
        byte[] bArray = XML_ROOT_ELEMENT_NO_DECL.getBytes("UTF-16BE");
        printByteArray(bArray);
    } catch (Exception e) {
        e.printStackTrace();
    }

    InputStreamReader reader = new InputStreamReader(getInputStream(new byte[][] {BOM_UTF_16BE, XML_ROOT_ELEMENT_NO_DECL.getBytes("UTF-16BE")}), "UnicodeBig");
    StringBuilder sb = new StringBuilder();
    int c = -1;
    while ((c = reader.read()) != -1) {
        sb.append((char) c);
    }
    System.out.println("GET:" + sb);
}
If we change "UnicodeBig" to "UTF-16", Harmony parses the stream correctly.
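For reference, the "UTF-16" case can be checked in isolation with a small sketch like the one below. The class name Utf16BomCheck and the helper roundTrip are made up for illustration and are not part of Harmony. It relies only on the documented behaviour of the "UTF-16" charset, which reads the byte order mark to pick the byte order and does not return it as a character.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical helper class, for illustration only.
final class Utf16BomCheck {
    // Big-endian byte order mark, same bytes as in the test above.
    static final byte[] BOM_UTF_16BE = {(byte) 0xFE, (byte) 0xFF};

    // Encodes the text as BOM + UTF-16BE bytes, then decodes it again through
    // an InputStreamReader opened with the given charset name.
    static String roundTrip(String text, String charsetName) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(BOM_UTF_16BE);
        out.write(text.getBytes("UTF-16BE"));

        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(out.toByteArray()), charsetName);
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        reader.close();
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "<org.eclipse.core.runtime.tests.root-element/>";
        // With "UTF-16" the BOM is consumed, so the decoded text equals the input.
        System.out.println("GET:" + roundTrip(text, "UTF-16"));
    }
}
```

Passing "UnicodeBig" instead of "UTF-16" to roundTrip is the case that fails on Harmony with UnsupportedEncodingException.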
There are two ways to fix this:
1. Add the mapping in InputStreamReader and OutputStreamWriter
2. Add the mapping in the Charset.forName(), which will let the Charset support UnicodeBig and UnicodeLittle.
I would prefer fix 2. Any comments are appreciated.
A patch will be attached later.
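To make fix 2 concrete, here is a rough standalone sketch of the idea. It is not the Harmony code: LegacyCharsetNames, canonicalName and forName are hypothetical names, and the mapping is done in a wrapper around the public Charset.forName() rather than inside Charset itself. It only assumes that the runtime supports "UTF-16", which every Java platform is required to do.

```java
import java.nio.charset.Charset;

// Hypothetical wrapper, for illustration only; the real fix would live in Charset.
final class LegacyCharsetNames {
    // Maps the two legacy names (case-insensitively) to "UTF-16";
    // every other name is passed through unchanged.
    static String canonicalName(String charsetName) {
        if ("UnicodeBig".equalsIgnoreCase(charsetName)
                || "UnicodeLittle".equalsIgnoreCase(charsetName)) {
            return "UTF-16";
        }
        return charsetName;
    }

    // Looks the charset up after applying the mapping above.
    static Charset forName(String charsetName) {
        return Charset.forName(canonicalName(charsetName));
    }

    public static void main(String[] args) {
        System.out.println(forName("UnicodeBig").name());    // UTF-16
        System.out.println(forName("unicodelittle").name()); // UTF-16
        System.out.println(forName("UTF-8").name());         // UTF-8
    }
}
```

Because InputStreamReader and OutputStreamWriter resolve their encoding names through Charset, a mapping at this level would cover both classes without touching either of them, which is the main reason to prefer it over fix 1.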
> [classlib][luni] InputStreamReader can't handle UnicodeBig encoding
> -------------------------------------------------------------------
>
> Key: HARMONY-4196
> URL: https://issues.apache.org/jira/browse/HARMONY-4196
> Project: Harmony
> Issue Type: Bug
> Components: Classlib
> Reporter: Vasily Zakharov
> Assignee: Alexei Zakharov
> Priority: Minor
> Attachments: Harmony-4196-InputStreamReader_diagnostics.patch
>
>
> Consider the following simple test:
> import java.io.*;
> public class Test {
>     public static void main(String[] args) {
>         try {
>             new InputStreamReader(new ByteArrayInputStream(new byte[] {(byte) 0xFE, (byte) 0xFF}), "UnicodeBig");
>             System.out.println("SUCCESS");
>         } catch (Throwable e) {
>             System.out.println("FAIL:");
>             e.printStackTrace(System.out);
>         }
>     }
> }
> Output on RI:
> SUCCESS
> Output on Harmony (both DRL VM and IBM VM):
> FAIL:
> java.io.UnsupportedEncodingException
> at java.io.InputStreamReader.<init>(InputStreamReader.java:104)
> at Test.main(Test.java:6)
> Additional investigation shows that the cause for this exception is:
> java.nio.charset.UnsupportedCharsetException: The unsupported charset name is "UnicodeBig".
> at java.nio.charset.Charset.forName(Charset.java:564)
> at java.io.InputStreamReader.<init>(InputStreamReader.java:99)
> at Test.main(Test.java:5)
> Interestingly, a direct call to Charset.forName("UnicodeBig") throws the same exception on the RI as well.
> So the problem seems to be in InputStreamReader itself rather than in Charset.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [jira] Commented: (HARMONY-4196) [classlib][luni] InputStreamReader can't handle UnicodeBig encoding
Posted by Charles Lee <li...@gmail.com>.
Here is what the patch looks like:
diff --git modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/InputStreamReaderTest.java modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/InputStreamReaderTest.java
index 5edc277..88a8da7 100644
--- modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/InputStreamReaderTest.java
+++ modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/InputStreamReaderTest.java
@@ -55,6 +55,12 @@ public class InputStreamReaderTest extends TestCase {
                 bytes = new byte[] { '\u001b', '$', 'B', '6', 'e', 'B', 'h',
                         '\u001b', '(', 'B' };
                 break;
+            case 3:
+                bytes = new byte[] { (byte) 0xff, (byte) 0xfe };
+                break;
+            case 4:
+                bytes = new byte[] { (byte) 0xfe, (byte) 0xff };
+                break;
             }
             count = bytes.length;
         }
@@ -97,6 +103,8 @@ public class InputStreamReaderTest extends TestCase {
     private InputStreamReader reader;
+    private InputStreamReader inUTF16;
+
     private final String source = "This is a test message with Unicode character. \u4e2d\u56fd is China's name in Chinese";
     /*
@@ -246,6 +254,20 @@ public class InputStreamReaderTest extends TestCase {
         assertEquals(Charset.forName(reader2.getEncoding()), Charset
                 .forName("utf-8"));
         reader2.close();
+        try {
+            InputStream streamIn16 = new LimitedByteArrayInputStream(3);
+            inUTF16 = new InputStreamReader(streamIn16, "UnicodeLittle");
+            inUTF16.close();
+        } catch (UnsupportedEncodingException e) {
+            fail ("Should Support UnicodeLittle");
+        }
+        try {
+            InputStream streamIn16 = new LimitedByteArrayInputStream(4);
+            inUTF16 = new InputStreamReader(streamIn16, "UnicodeBig");
+            inUTF16.close();
+        } catch (UnsupportedEncodingException e) {
+            fail ("Should Support UnicodeBig");
+        }
     }
     /**
diff --git modules/nio_char/src/main/java/java/nio/charset/Charset.java modules/nio_char/src/main/java/java/nio/charset/Charset.java
index 7b8d79d..65a2593 100644
--- modules/nio_char/src/main/java/java/nio/charset/Charset.java
+++ modules/nio_char/src/main/java/java/nio/charset/Charset.java
@@ -508,6 +508,9 @@ public abstract class Charset implements Comparable<Charset> {
      *             If the desired charset is not supported by this runtime.
      */
     public static Charset forName(String charsetName) {
+        if ("UnicodeBig".equalsIgnoreCase(charsetName) || "UnicodeLittle".equalsIgnoreCase(charsetName)) {
+            charsetName = "UTF-16";
+        }
         Charset c = forNameInternal(charsetName);
         if (null == c) {
             throw new UnsupportedCharsetException(charsetName);
--
Yours sincerely,
Charles Lee