Posted to dev@harmony.apache.org by Charles Lee <li...@gmail.com> on 2009/03/05 08:58:46 UTC
Re: [jira] Commented: (HARMONY-4196) [classlib][luni] InputStreamReader can't handle UnicodeBig encoding
Here is what the patch looks like:
diff --git modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/InputStreamReaderTest.java modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/InputStreamReaderTest.java
index 5edc277..88a8da7 100644
--- modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/InputStreamReaderTest.java
+++ modules/luni/src/test/api/common/org/apache/harmony/luni/tests/java/io/InputStreamReaderTest.java
@@ -55,6 +55,12 @@ public class InputStreamReaderTest extends TestCase {
bytes = new byte[] { '\u001b', '$', 'B', '6', 'e', 'B', 'h',
'\u001b', '(', 'B' };
break;
+ case 3:
+ bytes = new byte[] { (byte) 0xff, (byte) 0xfe };
+ break;
+ case 4:
+ bytes = new byte[] { (byte) 0xfe, (byte) 0xff };
+ break;
}
count = bytes.length;
}
@@ -97,6 +103,8 @@ public class InputStreamReaderTest extends TestCase {
private InputStreamReader reader;
+ private InputStreamReader inUTF16;
+
private final String source = "This is a test message with Unicode character. \u4e2d\u56fd is China's name in Chinese";
/*
@@ -246,6 +254,20 @@ public class InputStreamReaderTest extends TestCase {
assertEquals(Charset.forName(reader2.getEncoding()), Charset
.forName("utf-8"));
reader2.close();
+ try {
+ InputStream streamIn16 = new LimitedByteArrayInputStream(3);
+ inUTF16 = new InputStreamReader(streamIn16, "UnicodeLittle");
+ inUTF16.close();
+ } catch (UnsupportedEncodingException e) {
+ fail ("Should Support UnicodeLittle");
+ }
+ try {
+ InputStream streamIn16 = new LimitedByteArrayInputStream(4);
+ inUTF16 = new InputStreamReader(streamIn16, "UnicodeBig");
+ inUTF16.close();
+ } catch (UnsupportedEncodingException e) {
+ fail ("Should Support UnicodeBig");
+ }
}
/**
diff --git modules/nio_char/src/main/java/java/nio/charset/Charset.java modules/nio_char/src/main/java/java/nio/charset/Charset.java
index 7b8d79d..65a2593 100644
--- modules/nio_char/src/main/java/java/nio/charset/Charset.java
+++ modules/nio_char/src/main/java/java/nio/charset/Charset.java
@@ -508,6 +508,9 @@ public abstract class Charset implements Comparable<Charset> {
* If the desired charset is not supported by this runtime.
*/
public static Charset forName(String charsetName) {
+ if ("UnicodeBig".equalsIgnoreCase(charsetName) || "UnicodeLittle".equalsIgnoreCase(charsetName)) {
+ charsetName = "UTF-16";
+ }
Charset c = forNameInternal(charsetName);
if (null == c) {
throw new UnsupportedCharsetException(charsetName);
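
For anyone who wants to try the idea outside the Harmony tree, here is a minimal standalone sketch of the same alias mapping. The class LegacyCharsetAliases and its resolve/decode methods are hypothetical names made up for this illustration; they are not part of the patch or of Harmony. It only assumes that the runtime's "UTF-16" decoder picks the byte order from a leading byte-order mark.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper for illustration only; not part of the patch or of Harmony.
final class LegacyCharsetAliases {

    // Maps the two legacy names to "UTF-16", mirroring the check added to
    // Charset.forName() in the patch; every other name is returned unchanged.
    static String resolve(String charsetName) {
        if ("UnicodeBig".equalsIgnoreCase(charsetName)
                || "UnicodeLittle".equalsIgnoreCase(charsetName)) {
            return "UTF-16";
        }
        return charsetName;
    }

    // Reads every character from the bytes through an InputStreamReader that
    // uses the charset the given name resolves to.
    static String decode(byte[] bytes, String charsetName) {
        Charset cs = Charset.forName(resolve(charsetName));
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), cs)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The letter 'A' preceded by a big-endian and a little-endian BOM.
        byte[] bigEndian = { (byte) 0xFE, (byte) 0xFF, 0x00, 'A' };
        byte[] littleEndian = { (byte) 0xFF, (byte) 0xFE, 'A', 0x00 };
        System.out.println(decode(bigEndian, "UnicodeBig"));
        System.out.println(decode(littleEndian, "UnicodeLittle"));
    }
}
```

One thing to keep in mind with this mapping: "UTF-16" defaults to big-endian when the input has no byte-order mark, so little-endian data without a BOM would not be decoded as little-endian under the "UnicodeLittle" name.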
On Thu, Mar 5, 2009 at 2:41 PM, Li Jing Qin (JIRA) <ji...@apache.org> wrote:
>
> [
> https://issues.apache.org/jira/browse/HARMONY-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679092#action_12679092]
>
> Li Jing Qin commented on HARMONY-4196:
> --------------------------------------
>
> Hey guys, I am running the EUT tests for 3.5, and this issue also blocks a
> test case, so I decided to fix it.
> I agree with Paulex that we should map UnicodeBig and UnicodeLittle to
> UTF-16. Here is a similar test:
> public final static byte[] BOM_UTF_16BE = {(byte) 0xFE, (byte) 0xFF};
>
> public static void printByteArray(byte[] array) {
> System.out.println("LEN: " + array.length);
> for (byte b : array) {
> System.out.print(Character.forDigit(((b & 0xF0) >> 4), 16));
> System.out.print(Character.forDigit((b & 0x0F), 16));
> System.out.print(" ");
> }
> System.out.println();
> }
>
> public static InputStream getInputStream(byte[][] contents) {
> int size = 0;
> // computes final array size
> for (int i = 0; i < contents.length; i++)
> size += contents[i].length;
> byte[] full = new byte[size];
> int fullIndex = 0;
> // concatenates all byte arrays
> for (int i = 0; i < contents.length; i++)
> for (int j = 0; j < contents[i].length; j++)
> full[fullIndex++] = contents[i][j];
> return new ByteArrayInputStream(full);
> }
>
> public static void main(String[] args) throws Exception {
> String XML_ROOT_ELEMENT_NO_DECL = "<org.eclipse.core.runtime.tests.root-element/>";
> try {
> byte[] bArray = XML_ROOT_ELEMENT_NO_DECL.getBytes("UTF-16BE");
> printByteArray(bArray);
> } catch (Exception e) {
> e.printStackTrace();
> }
>
> InputStreamReader reader = new InputStreamReader(getInputStream(new byte[][] {BOM_UTF_16BE,
> XML_ROOT_ELEMENT_NO_DECL.getBytes("UTF-16BE")}), "UnicodeBig");
> StringBuilder sb = new StringBuilder();
> int c = -1;
> while ((c = reader.read()) != -1) {
> sb.append((char)c);
> }
> System.out.println("GET:" + sb);
> }
>
> If we change "UnicodeBig" to "UTF-16", Harmony parses the stream
> correctly.
>
> There are two ways to fix this:
> 1. Add the mapping in InputStreamReader and OutputStreamWriter.
> 2. Add the mapping in Charset.forName(), which will make Charset
> support UnicodeBig and UnicodeLittle.
>
> I would prefer fix 2. Any comments are appreciated.
> The patch will be attached later.
>
>
> > [classlib][luni] InputStreamReader can't handle UnicodeBig encoding
> > -------------------------------------------------------------------
> >
> > Key: HARMONY-4196
> > URL: https://issues.apache.org/jira/browse/HARMONY-4196
> > Project: Harmony
> > Issue Type: Bug
> > Components: Classlib
> > Reporter: Vasily Zakharov
> > Assignee: Alexei Zakharov
> > Priority: Minor
> > Attachments: Harmony-4196-InputStreamReader_diagnostics.patch
> >
> >
> > Consider the following simple test:
> > import java.io.*;
> > public class Test {
> > public static void main(String[] args) {
> > try {
> > new InputStreamReader(new ByteArrayInputStream(new byte[] {(byte) 0xFE, (byte) 0xFF}), "UnicodeBig");
> > System.out.println("SUCCESS");
> > } catch (Throwable e) {
> > System.out.println("FAIL:");
> > e.printStackTrace(System.out);
> > }
> > }
> > }
> > Output on RI:
> > SUCCESS
> > Output on Harmony (both DRL VM and IBM VM):
> > FAIL:
> > java.io.UnsupportedEncodingException
> > at java.io.InputStreamReader.<init>(InputStreamReader.java:104)
> > at Test.main(Test.java:6)
> > Additional investigation shows that the cause for this exception is:
> > java.nio.charset.UnsupportedCharsetException: The unsupported charset name is "UnicodeBig".
> > at java.nio.charset.Charset.forName(Charset.java:564)
> > at java.io.InputStreamReader.<init>(InputStreamReader.java:99)
> > at Test.main(Test.java:5)
> > Interestingly, a direct call to Charset.forName("UnicodeBig") causes
> > the same exception on the RI as well.
> > So it seems the problem is not in Charset but in InputStreamReader
> > itself.
>
--
Yours sincerely,
Charles Lee