Posted to dev@poi.apache.org by kl...@apache.org on 2003/12/02 18:46:01 UTC
cvs commit: jakarta-poi/src/testcases/org/apache/poi/hpsf/data TestChineseProperties.doc
klute 2003/12/02 09:46:01
Modified: src/documentation/content/xdocs changes.xml
src/documentation/content/xdocs/hpsf how-to.xml
internals.xml todo.xml
src/examples/src/org/apache/poi/hpsf/examples
CopyCompare.java WriteAuthorAndTitle.java
src/java/org/apache/poi/hpsf MutableProperty.java
MutableSection.java Property.java PropertySet.java
Section.java TypeWriter.java VariantSupport.java
src/testcases/org/apache/poi/hpsf/basic TestWrite.java
Added: src/testcases/org/apache/poi/hpsf/data
TestChineseProperties.doc
Log:
HPSF: codepage support added
Revision Changes Path
1.7 +4 -0 jakarta-poi/src/documentation/content/xdocs/changes.xml
Index: changes.xml
===================================================================
RCS file: /home/cvs/jakarta-poi/src/documentation/content/xdocs/changes.xml,v
retrieving revision 1.6
retrieving revision 1.7
diff -u -r1.6 -r1.7
--- changes.xml 5 Aug 2003 04:00:13 -0000 1.6
+++ changes.xml 2 Dec 2003 17:46:00 -0000 1.7
@@ -12,7 +12,11 @@
<person id="MJ" name="Marc Johnson" email="mjohnson@apache.org"/>
<person id="NKB" name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/>
<person id="POI-DEVELOPERS" name="POI Developers" email="poi-dev@jakarta.apache.org"/>
+ <person id="RK" name="Rainer Klute" email="klute@apache.org"/>
</devs>
+ <release version="2.0-pre3" date="unreleased">
+ <action dev="RK" type="add">HPSF: Much better codepage support</action>
+ </release>
<release version="2.0-pre1" date="unreleased">
<action dev="POI-DEVELOPERS" type="add">Patch applied for deep cloning of worksheets was provided</action>
<action dev="POI-DEVELOPERS" type="add">Patch applied to allow sheet reordering</action>
1.10 +30 -13 jakarta-poi/src/documentation/content/xdocs/hpsf/how-to.xml
Index: how-to.xml
===================================================================
RCS file: /home/cvs/jakarta-poi/src/documentation/content/xdocs/hpsf/how-to.xml,v
retrieving revision 1.9
retrieving revision 1.10
diff -u -r1.9 -r1.10
--- how-to.xml 20 Sep 2003 15:43:07 -0000 1.9
+++ how-to.xml 2 Dec 2003 17:46:00 -0000 1.10
@@ -708,8 +708,9 @@
<td>The property's value is the number of a <strong>codepage</strong>,
i.e. a mapping from character codes to characters. All strings in the
section containing this property must be interpreted using this
- codepage. Typical property values are 1252 (8-bit "western" characters)
- or 1200 (16-bit Unicode characters).</td>
+ codepage. Typical property values are 1252 (8-bit "western" characters,
+ Windows-1252), 1200 (16-bit Unicode characters, UTF-16), or 65001 (8-bit
+ Unicode characters, UTF-8).</td>
</tr>
</table>
</section>
@@ -833,18 +834,34 @@
</section>
<section><title>Codepage support</title>
- <fixme author="Rainer Klute">Improve codepage support!</fixme>
<p>The property with ID 1 holds the number of the codepage which was used
- to encode the strings in this section. The present HPSF codepage support
- is still very limited: When reading property value strings, HPSF
- distinguishes between 16-bit characters and 8-bit characters. 16-bit
- characters should be Unicode characters and thus be okay. 8-bit
- characters are interpreted according to the platform's default character
- set. This is fine as long as the document being read has been written on
- a platform with the same default character set. However, if you receive a
- document from another region of the world and want to process it with
- HPSF you are in trouble - unless the creator used Unicode, of course.</p>
+ to encode the strings in this section. If this property is not available
+ in a section, the platform's default character encoding will be
+ used. This works fine as long as the document being read has been written
+ on a platform with the same default character encoding. However, if you
+ receive a document from another region of the world and the codepage is
+ undefined, you are in trouble.</p>
+
+ <p>HPSF's codepage support is as good as the character encoding support of
+ the Java Virtual Machine (JVM) the application runs on. If HPSF
+ encounters a codepage number it assumes that the JVM has a character
+ encoding with a corresponding name. For example, if the codepage is 1252,
+ HPSF uses the character encoding "cp1252" to read or write strings. If
+ the JVM does not have that character encoding installed or if the
+ codepage number is illegal, an UnsupportedEncodingException will be
+ thrown.</p>
+
+ <p>There are two exceptions to the rule that a character encoding's name
+ is derived from the codepage number by prepending the string "cp" to
+ it:</p>
+
+ <dl>
+ <dt>Codepage 1200</dt>
+ <dd>is mapped to the character encoding "UTF-16".</dd>
+ <dt>Codepage 65001</dt>
+ <dd>is mapped to the character encoding "UTF-8".</dd>
+ </dl>
</section>
</section>
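The mapping rule described in the how-to text above can be sketched in a few lines of Java. This is only an illustration of the rule, not the committed HPSF code: the class name CodepageSketch and the decode helper are hypothetical, and whether an encoding such as "cp1252" is available depends on the JVM.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical illustration class; it is not part of HPSF.
class CodepageSketch {

    // Maps a codepage number to a Java encoding name: 1200 and 65001 are
    // special cases, every other positive number becomes "cp" + number.
    static String codepageToEncoding(int codepage)
            throws UnsupportedEncodingException {
        if (codepage <= 0) {
            throw new UnsupportedEncodingException(
                "Codepage number may not be " + codepage);
        }
        switch (codepage) {
            case 1200:
                return "UTF-16";
            case 65001:
                return "UTF-8";
            default:
                return "cp" + codepage;
        }
    }

    // Decodes raw 8-bit string data. A codepage of -1 stands for "undefined"
    // and falls back to the platform's default character encoding.
    static String decode(byte[] raw, int codepage)
            throws UnsupportedEncodingException {
        return codepage == -1
            ? new String(raw)
            : new String(raw, codepageToEncoding(codepage));
    }

    public static void main(String[] args) throws Exception {
        byte[] western = { (byte) 0xE4, (byte) 0xF6, (byte) 0xFC };
        // 0xE4 0xF6 0xFC are a-umlaut, o-umlaut, u-umlaut in codepage 1252.
        System.out.println(
            decode(western, 1252).equals("\u00e4\u00f6\u00fc")); // prints "true"
    }
}
```

One caveat that may matter when adapting this: Java's "UTF-16" decoder assumes big-endian byte order when the data carries no byte order mark, whereas Windows codepage 1200 denotes little-endian UTF-16. "UTF-16LE" might therefore be the safer name for codepage 1200.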
1.9 +55 -1 jakarta-poi/src/documentation/content/xdocs/hpsf/internals.xml
Index: internals.xml
===================================================================
RCS file: /home/cvs/jakarta-poi/src/documentation/content/xdocs/hpsf/internals.xml,v
retrieving revision 1.8
retrieving revision 1.9
diff -u -r1.8 -r1.9
--- internals.xml 11 Sep 2003 21:48:47 -0000 1.8
+++ internals.xml 2 Dec 2003 17:46:00 -0000 1.9
@@ -944,6 +944,60 @@
+ <section>
+ <title>The Dictionary</title>
+
+ <p>What a dictionary is good for is explained in the <link
+ href="how-to.html">HPSF HOW-TO</link>. This chapter explains how it is
+ organized internally.</p>
+
+ <p>The dictionary has a simple header consisting of a single UInt value. It
+ tells how many entries the dictionary comprises:</p>
+
+ <table>
+ <tr>
+ <th>Name</th>
+ <th>Data type</th>
+ <th>Description</th>
+ </tr>
+ <tr>
+ <td>nrEntries</td>
+ <td>UInt</td>
+ <td>Number of dictionary entries</td>
+ </tr>
+ </table>
+
+ <p>The dictionary entries follow the header. Each one looks like this:</p>
+
+ <table>
+ <tr>
+ <th>Name</th>
+ <th>Data type</th>
+ <th>Description</th>
+ </tr>
+ <tr>
+ <td>key</td>
+ <td>UInt</td>
+ <td>The unique number of this property, i.e. the PID</td>
+ </tr>
+ <tr>
+ <td>length</td>
+ <td>UInt</td>
+ <td>The length of the property name associated with the key</td>
+ </tr>
+ <tr>
+ <td>value</td>
+ <td>String</td>
+ <td>The property's name, terminated with a 0x00 character</td>
+ </tr>
+ </table>
+
+ <p>The entries are not aligned, i.e. each one follows its predecessor
+ without any gap or fill characters.</p>
+ </section>
+
+
+
<section><title>References</title>
<p>In order to assemble the HPSF description I used information publically
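The dictionary layout added to internals.xml above (an entry count followed by unaligned key/length/value entries) can be read with a short loop. The following is a rough sketch under several assumptions that the text does not state: UInt values are 4 bytes in little-endian order, the length field counts bytes including the terminating 0x00, and the names are stored in an 8-bit codepage. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; it is not part of HPSF.
class DictionarySketch {

    // Reads a dictionary starting at "offset": first nrEntries, then for each
    // entry the key (PID), the name length, and the 0x00-terminated name.
    static Map<Long, String> readDictionary(byte[] src, int offset,
                                            String encoding)
            throws UnsupportedEncodingException {
        ByteBuffer buf = ByteBuffer.wrap(src).order(ByteOrder.LITTLE_ENDIAN);
        buf.position(offset);
        long nrEntries = buf.getInt() & 0xFFFFFFFFL;
        Map<Long, String> dictionary = new LinkedHashMap<Long, String>();
        for (long i = 0; i < nrEntries; i++) {
            long key = buf.getInt() & 0xFFFFFFFFL;
            int length = buf.getInt();
            if (length < 0 || length > buf.remaining()) {
                throw new IllegalArgumentException(
                    "Illegal name length " + length + " for key " + key);
            }
            byte[] raw = new byte[length];
            buf.get(raw);
            // Strip the terminating 0x00 byte(s) before decoding the name.
            int end = length;
            while (end > 0 && raw[end - 1] == 0) {
                end--;
            }
            dictionary.put(Long.valueOf(key), new String(raw, 0, end, encoding));
            // Entries are not aligned, so the next one starts right here.
        }
        return dictionary;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = {
            1, 0, 0, 0,                  // nrEntries = 1
            2, 0, 0, 0,                  // key = 2
            6, 0, 0, 0,                  // length = 6 (5 characters + 0x00)
            'T', 'i', 't', 'l', 'e', 0   // value
        };
        System.out.println(readDictionary(data, 0, "US-ASCII"));
    }
}
```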
1.4 +10 -15 jakarta-poi/src/documentation/content/xdocs/hpsf/todo.xml
Index: todo.xml
===================================================================
RCS file: /home/cvs/jakarta-poi/src/documentation/content/xdocs/hpsf/todo.xml,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -r1.3 -r1.4
--- todo.xml 30 Aug 2003 09:19:04 -0000 1.3
+++ todo.xml 2 Dec 2003 17:46:00 -0000 1.4
@@ -21,25 +21,20 @@
information streams.
</li>
<li>
- Add codepage support: Presently the bytes making out the string in a
- property's value are interpreted using the platform's default character
- set.
- </li>
- <li>
- Add resource bundles to
- <code>org.apache.poi.hpsf.wellknown</code> to ease
- localizations. This would be useful for mapping standard property IDs to
- localized strings. Example: The property ID 4 could be mapped to "Author"
- in English or "Verfasser" in German.
+ Add resource bundles to
+ <code>org.apache.poi.hpsf.wellknown</code> to ease
+ localizations. This would be useful for mapping standard property IDs to
+ localized strings. Example: The property ID 4 could be mapped to "Author"
+ in English or "Verfasser" in German.
</li>
<li>
Implement reading functionality for those property types that are not
- yet supported. HPSF should return proper Java types instead of just byte
- arrays.
+ yet supported. HPSF should return proper Java types instead of just byte
+ arrays.
</li>
<li>
- Add WMF to <code>java.awt.Image</code> example code in <link
- href="thumbnails.html">Thumbnail HOW TO</link>.
+ Add WMF to <code>java.awt.Image</code> example code in the <link
+ href="thumbnails.html">Thumbnail HOW-TO</link>.
</li>
</ol>
</section>
1.2 +5 -2 jakarta-poi/src/examples/src/org/apache/poi/hpsf/examples/CopyCompare.java
Index: CopyCompare.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/examples/src/org/apache/poi/hpsf/examples/CopyCompare.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -r1.1 -r1.2
--- CopyCompare.java 20 Sep 2003 15:43:08 -0000 1.1
+++ CopyCompare.java 2 Dec 2003 17:46:01 -0000 1.2
@@ -558,7 +558,10 @@
* exists. However, since we have full control about directory
* creation we can ensure that this will never happen. */
ex.printStackTrace(System.err);
- throw new RuntimeException(ex);
+ throw new RuntimeException(ex.toString());
+ /* FIXME (2): Replace the previous line by the following once we
+ * no longer need JDK 1.3 compatibility. */
+ // throw new RuntimeException(ex);
}
}
}
1.4 +5 -2 jakarta-poi/src/examples/src/org/apache/poi/hpsf/examples/WriteAuthorAndTitle.java
Index: WriteAuthorAndTitle.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/examples/src/org/apache/poi/hpsf/examples/WriteAuthorAndTitle.java,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -r1.3 -r1.4
--- WriteAuthorAndTitle.java 20 Sep 2003 15:43:08 -0000 1.3
+++ WriteAuthorAndTitle.java 2 Dec 2003 17:46:01 -0000 1.4
@@ -444,7 +444,10 @@
* exists. However, since we have full control about directory
* creation we can ensure that this will never happen. */
ex.printStackTrace(System.err);
- throw new RuntimeException(ex);
+ throw new RuntimeException(ex.toString());
+ /* FIXME (2): Replace the previous line by the following once we
+ * no longer need JDK 1.3 compatibility. */
+ // throw new RuntimeException(ex);
}
}
}
1.3 +4 -3 jakarta-poi/src/java/org/apache/poi/hpsf/MutableProperty.java
Index: MutableProperty.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/java/org/apache/poi/hpsf/MutableProperty.java,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -r1.2 -r1.3
--- MutableProperty.java 4 Sep 2003 20:15:24 -0000 1.2
+++ MutableProperty.java 2 Dec 2003 17:46:01 -0000 1.3
@@ -80,19 +80,20 @@
* <p>Writes the property to an output stream.</p>
*
* @param out The output stream to write to.
+ * @param codepage The codepage to use for writing non-wide strings
* @return the number of bytes written to the stream
*
* @exception IOException if an I/O error occurs
* @exception WritingNotSupportedException if a variant type is to be
* written that is not yet supported
*/
- public int write(final OutputStream out)
+ public int write(final OutputStream out, final int codepage)
throws IOException, WritingNotSupportedException
{
int length = 0;
long variantType = getType();
length += TypeWriter.writeUIntToStream(out, variantType);
- length += VariantSupport.write(out, variantType, getValue());
+ length += VariantSupport.write(out, variantType, getValue(), codepage);
return length;
}
1.7 +6 -6 jakarta-poi/src/java/org/apache/poi/hpsf/MutableSection.java
Index: MutableSection.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/java/org/apache/poi/hpsf/MutableSection.java,v
retrieving revision 1.6
retrieving revision 1.7
diff -u -r1.6 -r1.7
--- MutableSection.java 23 Oct 2003 20:44:24 -0000 1.6
+++ MutableSection.java 2 Dec 2003 17:46:01 -0000 1.7
@@ -420,16 +420,16 @@
/* If the property ID is not equal 0 we write the property and all
* is fine. However, if it equals 0 we have to write the section's
- * dictionary which does not have a type but just a value. */
+ * dictionary which has an implicit type only and an explicit
+ * value. */
if (id != 0)
/* Write the property and update the position to the next
* property. */
- position += p.write(propertyStream);
+ position += p.write(propertyStream, getCodepage());
else
{
- final Integer codepage =
- (Integer) getProperty(PropertyIDMap.PID_CODEPAGE);
- if (codepage == null)
+ final int codepage = getCodepage();
+ if (codepage == -1)
throw new IllegalPropertySetDataException
("Codepage (property 1) is undefined.");
position += writeDictionary(propertyStream, dictionary);
1.16 +28 -3 jakarta-poi/src/java/org/apache/poi/hpsf/Property.java
Index: Property.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/java/org/apache/poi/hpsf/Property.java,v
retrieving revision 1.15
retrieving revision 1.16
diff -u -r1.15 -r1.16
--- Property.java 18 Sep 2003 18:56:35 -0000 1.15
+++ Property.java 2 Dec 2003 17:46:01 -0000 1.16
@@ -62,9 +62,11 @@
*/
package org.apache.poi.hpsf;
+import java.io.UnsupportedEncodingException;
import java.util.HashMap;
import java.util.Map;
+import org.apache.poi.util.HexDump;
import org.apache.poi.util.LittleEndian;
/**
@@ -161,9 +163,13 @@
* @param length The property's type/value pair's length in bytes.
* @param codepage The section's and thus the property's
* codepage. It is needed only when reading string values.
+ *
+ * @exception UnsupportedEncodingException if the specified codepage is not
+ * supported
*/
public Property(final long id, final byte[] src, final long offset,
final int length, final int codepage)
+ throws UnsupportedEncodingException
{
this.id = id;
@@ -183,7 +189,7 @@
try
{
- value = VariantSupport.read(src, o, length, (int) type);
+ value = VariantSupport.read(src, o, length, (int) type, codepage);
}
catch (UnsupportedVariantTypeException ex)
{
@@ -382,8 +388,27 @@
b.append(getID());
b.append(", type: ");
b.append(getType());
+ final Object value = getValue();
b.append(", value: ");
- b.append(getValue());
+ b.append(String.valueOf(value));
+ if (value instanceof String)
+ {
+ final String s = (String) value;
+ final int l = s.length();
+ final byte[] bytes = new byte[l * 2];
+ for (int i = 0; i < l; i++)
+ {
+ final char c = s.charAt(i);
+ final byte high = (byte) ((c & 0x00ff00) >> 8);
+ final byte low = (byte) ((c & 0x0000ff) >> 0);
+ bytes[i * 2] = high;
+ bytes[i * 2 + 1] = low;
+ }
+ final String hex = HexDump.dump(bytes, 0L, 0);
+ b.append(" [");
+ b.append(hex);
+ b.append("]");
+ }
b.append(']');
return b.toString();
}
1.15 +12 -5 jakarta-poi/src/java/org/apache/poi/hpsf/PropertySet.java
Index: PropertySet.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/java/org/apache/poi/hpsf/PropertySet.java,v
retrieving revision 1.14
retrieving revision 1.15
diff -u -r1.14 -r1.15
--- PropertySet.java 23 Oct 2003 20:44:24 -0000 1.14
+++ PropertySet.java 2 Dec 2003 17:46:01 -0000 1.15
@@ -56,6 +56,7 @@
import java.io.IOException;
import java.io.InputStream;
+import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.List;
@@ -300,9 +301,11 @@
* @param length The length of the stream data.
* @throws NoPropertySetStreamException if the byte array is not a
* property set stream.
+ *
+ * @exception UnsupportedEncodingException if the codepage is not supported
*/
public PropertySet(final byte[] stream, final int offset, final int length)
- throws NoPropertySetStreamException
+ throws NoPropertySetStreamException, UnsupportedEncodingException
{
if (isPropertySetStream(stream, offset, length))
init(stream, offset, length);
@@ -321,8 +324,11 @@
* complete byte array contents is the stream data.
* @throws NoPropertySetStreamException if the byte array is not a
* property set stream.
+ *
+ * @exception UnsupportedEncodingException if the codepage is not supported
*/
- public PropertySet(final byte[] stream) throws NoPropertySetStreamException
+ public PropertySet(final byte[] stream)
+ throws NoPropertySetStreamException, UnsupportedEncodingException
{
this(stream, 0, stream.length);
}
@@ -435,6 +441,7 @@
* @param length Length of the property set stream.
*/
private void init(final byte[] src, final int offset, final int length)
+ throws UnsupportedEncodingException
{
/* FIXME (3): Ensure that at most "length" bytes are read. */
@@ -651,7 +658,7 @@
final PropertySet ps = (PropertySet) o;
int byteOrder1 = ps.getByteOrder();
int byteOrder2 = getByteOrder();
- ClassID classId1 = ps.getClassID();
+ ClassID classID1 = ps.getClassID();
ClassID classID2 = getClassID();
int format1 = ps.getFormat();
int format2 = getFormat();
@@ -660,7 +667,7 @@
int sectionCount1 = ps.getSectionCount();
int sectionCount2 = getSectionCount();
if (byteOrder1 != byteOrder2 ||
- !classId1.equals(classID2) ||
+ !classID1.equals(classID2) ||
format1 != format2 ||
osVersion1 != osVersion2 ||
sectionCount1 != sectionCount2)
1.21 +20 -1 jakarta-poi/src/java/org/apache/poi/hpsf/Section.java
Index: Section.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/java/org/apache/poi/hpsf/Section.java,v
retrieving revision 1.20
retrieving revision 1.21
diff -u -r1.20 -r1.21
--- Section.java 23 Oct 2003 20:44:24 -0000 1.20
+++ Section.java 2 Dec 2003 17:46:01 -0000 1.21
@@ -54,6 +54,7 @@
*/
package org.apache.poi.hpsf;
+import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
@@ -193,8 +194,12 @@
* @param src Contains the complete property set stream.
* @param offset The position in the stream that points to the
* section's format ID.
+ *
+ * @exception UnsupportedEncodingException if the section's codepage is not
+ * supported.
*/
public Section(final byte[] src, final int offset)
+ throws UnsupportedEncodingException
{
int o1 = offset;
@@ -636,6 +641,20 @@
public Map getDictionary()
{
return dictionary;
+ }
+
+
+
+ /**
+ * <p>Gets the section's codepage, if any.</p>
+ *
+ * @return The section's codepage if one is defined, else -1.
+ */
+ public int getCodepage()
+ {
+ final Integer codepage =
+ (Integer) getProperty(PropertyIDMap.PID_CODEPAGE);
+ return codepage != null ? codepage.intValue() : -1;
}
}
1.3 +4 -3 jakarta-poi/src/java/org/apache/poi/hpsf/TypeWriter.java
Index: TypeWriter.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/java/org/apache/poi/hpsf/TypeWriter.java,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -r1.2 -r1.3
--- TypeWriter.java 30 Aug 2003 09:13:52 -0000 1.2
+++ TypeWriter.java 2 Dec 2003 17:46:01 -0000 1.3
@@ -185,7 +185,8 @@
* @exception IOException if an I/O error occurs
*/
public static void writeToStream(final OutputStream out,
- final Property[] properties)
+ final Property[] properties,
+ final int codepage)
throws IOException, UnsupportedVariantTypeException
{
/* If there are no properties don't write anything. */
@@ -207,7 +208,7 @@
final Property p = (Property) properties[i];
long type = p.getType();
writeUIntToStream(out, type);
- VariantSupport.write(out, (int) type, p.getValue());
+ VariantSupport.write(out, (int) type, p.getValue(), codepage);
}
}
1.6 +62 -26 jakarta-poi/src/java/org/apache/poi/hpsf/VariantSupport.java
Index: VariantSupport.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/java/org/apache/poi/hpsf/VariantSupport.java,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -r1.5 -r1.6
--- VariantSupport.java 23 Oct 2003 20:44:24 -0000 1.5
+++ VariantSupport.java 2 Dec 2003 17:46:01 -0000 1.6
@@ -64,6 +64,7 @@
import java.io.IOException;
import java.io.OutputStream;
+import java.io.UnsupportedEncodingException;
import java.util.Date;
import java.util.LinkedList;
import java.util.List;
@@ -163,17 +164,21 @@
* @param length The length of the variant including the variant
* type field
* @param type The variant type to read
+ * @param codepage The codepage to use to read non-wide strings
* @return A Java object that corresponds best to the variant
* field. For example, a VT_I4 is returned as a {@link Long}, a
* VT_LPSTR as a {@link String}.
* @exception ReadingNotSupportedException if a property is to be written
* who's variant type HPSF does not yet support
+ * @exception UnsupportedEncodingException if the specified codepage is not
+ * supported
*
* @see Variant
*/
public static Object read(final byte[] src, final int offset,
- final int length, final long type)
- throws ReadingNotSupportedException
+ final int length, final long type,
+ final int codepage)
+ throws ReadingNotSupportedException, UnsupportedEncodingException
{
Object value;
int o1 = offset;
@@ -221,18 +226,18 @@
* Read a byte string. In Java it is represented as a
* String object. The 0x00 bytes at the end must be
* stripped.
- *
- * FIXME (2): Reading an 8-bit string should pay attention
- * to the codepage. Currently the byte making out the
- * property's value are interpreted according to the
- * platform's default character set.
*/
final int first = o1 + LittleEndian.INT_SIZE;
long last = first + LittleEndian.getUInt(src, o1) - 1;
o1 += LittleEndian.INT_SIZE;
+ final int rawLength = (int) (last - first + 1);
while (src[(int) last] == 0 && first <= last)
last--;
- value = new String(src, (int) first, (int) (last - first + 1));
+ final int l = (int) (last - first + 1);
+ value = codepage != -1 ?
+ new String(src, (int) first, l,
+ codepageToEncoding(codepage)) :
+ new String(src, (int) first, l);
break;
}
case Variant.VT_LPWSTR:
@@ -299,12 +304,45 @@
/**
+ * <p>Turns a codepage number into the equivalent character encoding's
+ * name.</p>
+ *
+ * @param codepage The codepage number
+ *
+ * @return The character encoding's name. Codepage 1200 is mapped to
+ * "UTF-16" and codepage 65001 to "UTF-8". All other positive numbers are
+ * mapped to "cp" followed by the number, e.g. codepage 1252 yields the
+ * character encoding name "cp1252".
+ *
+ * @exception UnsupportedEncodingException if the specified codepage is
+ * less than or equal to zero.
+ */
+ public static String codepageToEncoding(final int codepage)
+ throws UnsupportedEncodingException
+ {
+ if (codepage <= 0)
+ throw new UnsupportedEncodingException
+ ("Codepage number may not be " + codepage);
+ switch (codepage)
+ {
+ case 1200:
+ return "UTF-16";
+ case 65001:
+ return "UTF-8";
+ default:
+ return "cp" + codepage;
+ }
+ }
+
+
+ /**
* <p>Writes a variant value to an output stream. This method ensures that
* always a multiple of 4 bytes is written.</p>
*
* @param out The stream to write the value to.
* @param type The variant's type.
* @param value The variant's value.
+ * @param codepage The codepage to use to write non-wide strings
* @return The number of entities that have been written. In many cases an
* "entity" is a byte but this is not always the case.
* @exception IOException if an I/O exceptions occurs
@@ -312,7 +350,7 @@
* who's variant type HPSF does not yet support
*/
public static int write(final OutputStream out, final long type,
- final Object value)
+ final Object value, final int codepage)
throws IOException, WritingNotSupportedException
{
int length = 0;
@@ -330,16 +368,13 @@
}
case Variant.VT_LPSTR:
{
- length = TypeWriter.writeUIntToStream
- (out, ((String) value).length() + 1);
- char[] s = Util.pad4((String) value);
- /* FIXME (2): The following line forces characters to bytes.
- * This is generally wrong and should only be done according to
- * a codepage. Alternatively Unicode could be written (see
- * Variant.VT_LPWSTR). */
- byte[] b = new byte[s.length + 1];
- for (int i = 0; i < s.length; i++)
- b[i] = (byte) s[i];
+ final byte[] bytes =
+ (codepage == -1 ?
+ ((String) value).getBytes() :
+ ((String) value).getBytes(codepageToEncoding(codepage)));
+ length = TypeWriter.writeUIntToStream(out, bytes.length + 1);
+ final byte[] b = new byte[bytes.length + 1];
+ System.arraycopy(bytes, 0, b, 0, bytes.length);
b[b.length - 1] = 0x00;
out.write(b);
length += b.length;
@@ -419,12 +454,13 @@
}
}
- /* Add 0x00 character to write a multiple of four bytes: */
- while (length % 4 != 0)
- {
- out.write(0);
- length++;
- }
+ /* Add 0x00 characters to write a multiple of four bytes: */
+ // FIXME (1) Try this!
+// while (length % 4 != 0)
+// {
+// out.write(0);
+// length++;
+// }
return length;
}
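The VT_LPSTR hunk above changes the length field from the character count to the byte count of the encoded string, which matters as soon as one character needs more than one byte. The sketch below isolates that step. It is not the HPSF implementation: the class and method names are hypothetical, the codepage mapping is repeated inline so the example stands alone, and the little-endian UInt layout is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;

// Hypothetical illustration class; it is not part of HPSF.
class LpstrWriteSketch {

    // Same mapping rule as in the commit: 1200 and 65001 are special cases.
    static String encodingFor(int codepage)
            throws UnsupportedEncodingException {
        if (codepage <= 0) {
            throw new UnsupportedEncodingException(
                "Codepage number may not be " + codepage);
        }
        if (codepage == 1200) {
            return "UTF-16";
        }
        if (codepage == 65001) {
            return "UTF-8";
        }
        return "cp" + codepage;
    }

    // Writes a 4-byte little-endian length (encoded bytes plus terminator),
    // the encoded bytes, and a single 0x00. Returns the bytes written.
    static int writeLpstr(OutputStream out, String value, int codepage)
            throws IOException {
        byte[] bytes = codepage == -1
            ? value.getBytes()
            : value.getBytes(encodingFor(codepage));
        int n = bytes.length + 1;
        out.write(n & 0xFF);
        out.write((n >>> 8) & 0xFF);
        out.write((n >>> 16) & 0xFF);
        out.write((n >>> 24) & 0xFF);
        out.write(bytes);
        out.write(0x00);
        return 4 + n;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // One character, but two bytes in UTF-8: the length field holds 3.
        System.out.println(writeLpstr(out, "\u00e4", 65001)); // prints "7"
    }
}
```

Computing the length from the encoded bytes keeps the length field consistent with what a reader will actually find in the stream, regardless of the codepage.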
1.8 +44 -36 jakarta-poi/src/testcases/org/apache/poi/hpsf/basic/TestWrite.java
Index: TestWrite.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/testcases/org/apache/poi/hpsf/basic/TestWrite.java,v
retrieving revision 1.7
retrieving revision 1.8
diff -u -r1.7 -r1.8
--- TestWrite.java 18 Sep 2003 18:56:35 -0000 1.7
+++ TestWrite.java 2 Dec 2003 17:46:01 -0000 1.8
@@ -357,7 +357,10 @@
catch (Exception ex)
{
ex.printStackTrace();
- throw new RuntimeException(ex);
+ throw new RuntimeException(ex.toString());
+ /* FIXME (2): Replace the previous line by the following
+ * one once we no longer need JDK 1.3 compatibility. */
+ // throw new RuntimeException(ex);
}
}
},
@@ -398,37 +401,40 @@
public void testVariantTypes()
{
Throwable t = null;
+ final int codepage = -1;
+ /* FIXME (2): Add tests for various codepages! */
try
{
- check(Variant.VT_EMPTY, null);
- check(Variant.VT_BOOL, new Boolean(true));
- check(Variant.VT_BOOL, new Boolean(false));
- check(Variant.VT_CF, new byte[]{0});
- check(Variant.VT_CF, new byte[]{0, 1});
- check(Variant.VT_CF, new byte[]{0, 1, 2});
- check(Variant.VT_CF, new byte[]{0, 1, 2, 3});
- check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4});
- check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4, 5});
- check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10});
- check(Variant.VT_I2, new Integer(27));
- check(Variant.VT_I4, new Long(28));
- check(Variant.VT_FILETIME, new Date());
- check(Variant.VT_LPSTR, "");
- check(Variant.VT_LPSTR, "�");
- check(Variant.VT_LPSTR, "��");
- check(Variant.VT_LPSTR, "���");
- check(Variant.VT_LPSTR, "����");
- check(Variant.VT_LPSTR, "�����");
- check(Variant.VT_LPSTR, "������");
- check(Variant.VT_LPSTR, "�������");
- check(Variant.VT_LPWSTR, "");
- check(Variant.VT_LPWSTR, "�");
- check(Variant.VT_LPWSTR, "��");
- check(Variant.VT_LPWSTR, "���");
- check(Variant.VT_LPWSTR, "����");
- check(Variant.VT_LPWSTR, "�����");
- check(Variant.VT_LPWSTR, "������");
- check(Variant.VT_LPWSTR, "�������");
+ check(Variant.VT_EMPTY, null, codepage);
+ check(Variant.VT_BOOL, new Boolean(true), codepage);
+ check(Variant.VT_BOOL, new Boolean(false), codepage);
+ check(Variant.VT_CF, new byte[]{0}, codepage);
+ check(Variant.VT_CF, new byte[]{0, 1}, codepage);
+ check(Variant.VT_CF, new byte[]{0, 1, 2}, codepage);
+ check(Variant.VT_CF, new byte[]{0, 1, 2, 3}, codepage);
+ check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4}, codepage);
+ check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4, 5}, codepage);
+ check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
+ codepage);
+ check(Variant.VT_I2, new Integer(27), codepage);
+ check(Variant.VT_I4, new Long(28), codepage);
+ check(Variant.VT_FILETIME, new Date(), codepage);
+ check(Variant.VT_LPSTR, "", codepage);
+ check(Variant.VT_LPSTR, "�", codepage);
+ check(Variant.VT_LPSTR, "��", codepage);
+ check(Variant.VT_LPSTR, "���", codepage);
+ check(Variant.VT_LPSTR, "����", codepage);
+ check(Variant.VT_LPSTR, "�����", codepage);
+ check(Variant.VT_LPSTR, "������", codepage);
+ check(Variant.VT_LPSTR, "�������", codepage);
+ check(Variant.VT_LPWSTR, "", codepage);
+ check(Variant.VT_LPWSTR, "�", codepage);
+ check(Variant.VT_LPWSTR, "��", codepage);
+ check(Variant.VT_LPWSTR, "���", codepage);
+ check(Variant.VT_LPWSTR, "����", codepage);
+ check(Variant.VT_LPWSTR, "�����", codepage);
+ check(Variant.VT_LPWSTR, "������", codepage);
+ check(Variant.VT_LPWSTR, "�������", codepage);
}
catch (Exception ex)
{
@@ -466,20 +472,22 @@
* @throws UnsupportedVariantTypeException if the variant is not supported.
* @throws IOException if an I/O exception occurs.
*/
- private void check(final long variantType, final Object value)
+ private void check(final long variantType, final Object value,
+ final int codepage)
throws UnsupportedVariantTypeException, IOException
{
final ByteArrayOutputStream out = new ByteArrayOutputStream();
- VariantSupport.write(out, variantType, value);
+ VariantSupport.write(out, variantType, value, codepage);
out.close();
final byte[] b = out.toByteArray();
final Object objRead =
VariantSupport.read(b, 0, b.length + LittleEndian.INT_SIZE,
- variantType);
+ variantType, -1);
if (objRead instanceof byte[])
{
- final int diff = diff(org.apache.poi.hpsf.Util.pad4
- ((byte[]) value), (byte[]) objRead);
+// final int diff = diff(org.apache.poi.hpsf.Util.pad4
+// ((byte[]) value), (byte[]) objRead);
+ final int diff = diff((byte[]) value, (byte[]) objRead);
if (diff >= 0)
fail("Byte arrays are different. First different byte is at " +
"index " + diff + ".");
1.1 jakarta-poi/src/testcases/org/apache/poi/hpsf/data/TestChineseProperties.doc
<<Binary file>>