Posted to dev@poi.apache.org by kl...@apache.org on 2003/02/05 20:33:28 UTC
cvs commit: jakarta-poi/src/java/org/apache/poi/hpsf/wellknown PropertyIDMap.java
klute 2003/02/05 11:33:27
Modified: src/documentation/xdocs/hpsf how-to.xml todo.xml
src/java/org/apache/poi/hpsf TypeReader.java
src/java/org/apache/poi/hpsf/wellknown PropertyIDMap.java
Log:
Completed the third main section of the HPSF HOW-TO.
Revision Changes Path
1.13 +352 -119 jakarta-poi/src/documentation/xdocs/hpsf/how-to.xml
Index: how-to.xml
===================================================================
RCS file: /home/cvs/jakarta-poi/src/documentation/xdocs/hpsf/how-to.xml,v
retrieving revision 1.12
retrieving revision 1.13
diff -u -r1.12 -r1.13
--- how-to.xml 2 Feb 2003 20:28:45 -0000 1.12
+++ how-to.xml 5 Feb 2003 19:33:27 -0000 1.13
@@ -33,10 +33,9 @@
</li>
<li>
- <p>The <link href="#sec3">third section</link> tells how to read
+ <p>The <link href="#sec3">third section</link> tells how to read
non-standard properties. Non-standard properties are application-specific
- name/value/type triples. <em>This section is still to be written. Look up
- the API documentation for the time being!</em></p>
+ triples consisting of an ID, a type, and a value.</p>
</li>
</ol>
@@ -303,54 +302,60 @@
<section title="Reading Non-Standard Properties">
<note>This section tells how to read non-standard properties. Non-standard
- properties are application-specific name/type/value triples.</note>
+ properties are application-specific ID/type/value triples.</note>
- <p>Now comes the really hardcode stuff. As mentioned above,
- <code>SummaryInformation</code> and
- <code>DocumentSummaryInformation</code> are just special cases of the
- general concept of a property set. The general concept says that a
- property set consists of <strong>properties</strong>. Each property is an
- entity that has a <strong>name</strong>, a <strong>type</strong>, and a
- <strong>value</strong>.</p>
-
- <p>Okay, that was still rather easy. However, to make things more
- complicated, Microsoft in its infinite wisdom decided that a property set
- shalt be broken into <strong>sections</strong>. Each section holds a bunch
- of properties. But since that's still not complicated enough: A section
- can optionally have a dictionary that maps property IDs to property
- names - we'll explain later what that means.</p>
-
- <p>So the procedure to get to the properties is as follows:</p>
-
- <ol>
- <li>Use the <code>PropertySetFactory</code> to create a
- <code>PropertySet</code> from an input stream. You can try this with any
- input stream: You'll either <code>PropertySet</code> instance or an
- exception is thrown.</li>
-
- <li>Call the <code>PropertySet</code>'s method <code>getSections()</code>
- to get a list of sections contained in the property set. Each section is
- an instance of the <code>Section</code> class.</li>
-
- <li>Each section has a format ID. The format ID of the first section in a
- property set determines the property set's type. For example, the first
- (and only) section of the SummaryInformation property set has a format ID
- of <code>F29F85E0-4FF9-1068-AB-91-08-00-2B-27-B3-D9</code>. You can
- get the format ID with <code>Section.getFormatID()</code>.</li>
-
- <li>The properties contained in a <code>Section</code> can be retrieved
- with <code>Section.getProperties()</code>. The result is an array of
- <code>Property</code> instances.</li>
-
- <li>A property has a name, a type, and a value. The <code>Property</code>
- class has methods to retrieve them.</li>
- </ol>
+ <section title="Overview">
+ <p>Now comes the real hardcore stuff. As mentioned above,
+ <code>SummaryInformation</code> and
+ <code>DocumentSummaryInformation</code> are just special cases of the
+ general concept of a property set. This concept says that a
+ <strong>property set</strong> consists of properties and that each
+ <strong>property</strong> is an entity with an <strong>ID</strong>, a
+ <strong>type</strong>, and a <strong>value</strong>.</p>
+
+ <p>Okay, that was still rather easy. However, to make things more
+ complicated, Microsoft in its infinite wisdom decided that a property set
+ shalt be broken into one or more <strong>sections</strong>. Each section
+ holds a bunch of properties. But since that's still not complicated
+ enough, a section may have an optional <strong>dictionary</strong> that
+ maps property IDs to <strong>property names</strong> - we'll explain
+ later what that means.</p>
+
+ <p>The procedure to get to the properties is the following:</p>
+
+ <ol>
+ <li>Use the <strong><code>PropertySetFactory</code></strong> class to
+ create a <code>PropertySet</code> object from a property set stream. If
+ you don't know whether an input stream is a property set stream, just
+ try to call <code>PropertySetFactory.create(java.io.InputStream)</code>:
+ You'll either get a <code>PropertySet</code> instance returned or an
+ exception is thrown.</li>
+
+ <li>Call the <code>PropertySet</code>'s method <code>getSections()</code>
+ to get the sections contained in the property set. Each section is
+ an instance of the <code>Section</code> class.</li>
+
+ <li>Each section has a format ID. The format ID of the first section in a
+ property set determines the property set's type. For example, the first
+ (and only) section of the SummaryInformation property set has a format
+ ID of <code>F29F85E0-4FF9-1068-AB-91-08-00-2B-27-B3-D9</code>. You can
+ get the format ID with <code>Section.getFormatID()</code>.</li>
+
+ <li>The properties contained in a <code>Section</code> can be retrieved
+ with <code>Section.getProperties()</code>. The result is an array of
+ <code>Property</code> instances.</li>
+
+ <li>A property has an ID, a type, and a value. The <code>Property</code>
+ class has methods to retrieve them.</li>
+ </ol>
+ </section>
- <p>Let's have a look at a sample Java application that dumps all property
- set streams contained in a POI file system. The full source code of this
- program can be found as <em>ReadCustomPropertySets.java</em> in the
- <em>examples</em> area of the POI source code tree. Here are the key
- sections:</p>
+ <section title="A Sample Application">
+ <p>Let's have a look at a sample Java application that dumps all property
+ set streams contained in a POI file system. The full source code of this
+ program can be found as <em>ReadCustomPropertySets.java</em> in the
+ <em>examples</em> area of the POI source code tree. Here are the key
+ sections:</p>
<source>import java.io.*;
import java.util.*;
@@ -381,8 +386,10 @@
<p>The <code>POIFSReader</code> is set up in a way that the listener
<code>MyPOIFSReaderListener</code> is called on every file in the POI file
system.</p>
+ </section>
- <p>The listener class tries to create a <code>PropertySet</code> from each
+ <section title="The Property Set">
+ <p>The listener class tries to create a <code>PropertySet</code> from each
stream using the <code>PropertySetFactory.create()</code> method:</p>
<source>static class MyPOIFSReaderListener implements POIFSReaderListener
@@ -420,8 +427,10 @@
other types of exceptions cause the program to terminate by throwing a
runtime exception. If all went well, we can print the name of the property
set stream.</p>
+ </section>
- <p>The next step is to print the number of sections followed by the
+ <section title="The Sections">
+ <p>The next step is to print the number of sections followed by the
sections themselves:</p>
<source>/* Print the number of sections: */
@@ -439,18 +448,18 @@
// See below for the complete loop body.
}</source>
- <p>The <code>PropertySet</code>'s method <code>getSectionCount()</code>
- returns the number of sections.</p>
+ <p>The <code>PropertySet</code>'s method <code>getSectionCount()</code>
+ returns the number of sections.</p>
- <p>To retrieve the sections, use the <code>getSections()</code>
- method. This method returns a <code>java.util.List</code> containing
- instances of the <code>Section</code> class in their proper order.</p>
-
- <p>The sample code shows a loop that retrieves the <code>Section</code>
- objects one by one and prints some information about each one. Here is the
- complete body of the loop:</p>
+ <p>To retrieve the sections, use the <code>getSections()</code>
+ method. This method returns a <code>java.util.List</code> containing
+ instances of the <code>Section</code> class in their proper order.</p>
+
+ <p>The sample code shows a loop that retrieves the <code>Section</code>
+ objects one by one and prints some information about each one. Here is
+ the complete body of the loop:</p>
- <source>/* Print a single section: */
+ <source>/* Print a single section: */
Section sec = (Section) i.next();
out(" Section " + nr++ + ":");
String s = hex(sec.getFormatID().getBytes());
@@ -473,49 +482,53 @@
out(" Property ID: " + id + ", type: " + type +
", value: " + value);
}</source>
+ </section>
- <p>The first method called on the <code>Section</code> instance is
- <code>getFormatID()</code>. As explained above, the format ID of the first
- section in a property set determines the type of the property set. Its
- type is <code>ClassID</code> which is essentially a sequence of 16
- bytes. A real application using its own type of a custom property set
- should have defined a unique format ID and, when reading a property set
- stream, should check the format ID is equal to that unique format ID. The
- sample program just prints the format ID it finds in a section:</p>
+ <section title="The Section's Format ID">
+ <p>The first method called on the <code>Section</code> instance is
+ <code>getFormatID()</code>. As explained above, the format ID of the
+ first section in a property set determines the type of the property
+ set. Its type is <code>ClassID</code> which is essentially a sequence of
+ 16 bytes. A real application using its own type of a custom property set
+ should have defined a unique format ID and, when reading a property set
+ stream, should check that the format ID is equal to that unique format ID. The
+ sample program just prints the format ID it finds in a section:</p>
- <source>String s = hex(sec.getFormatID().getBytes());
+ <source>String s = hex(sec.getFormatID().getBytes());
s = s.substring(0, s.length() - 1);
out(" Format ID: " + s);</source>
- <p>As you can see, the <code>getFormatID()</code> method returns a
- <code>ClassID</code> object. An array containing the bytes can be
- retrieved with <code>ClassID.getBytes()</code>. In order to get a nicely
- formatted printout, the sample program uses the <code>hex()</code> helper
- method which in turn uses the POI utility class <code>HexDump</code> in
- the <code>org.apache.poi.util</code> package. Another helper method is
- <code>out()</code> which just saves typing
- <code>System.out.println()</code>.</p>
-
- <p>Before getting the properties, it is possible to find out how many
- properties are available in the section via the
- <code>Section.getPropertyCount()</code>. The sample application uses this
- method to print the number of properties to the standard output:</p>
+ <p>As you can see, the <code>getFormatID()</code> method returns a
+ <code>ClassID</code> object. An array containing the bytes can be
+ retrieved with <code>ClassID.getBytes()</code>. In order to get a nicely
+ formatted printout, the sample program uses the <code>hex()</code> helper
+ method which in turn uses the POI utility class <code>HexDump</code> in
+ the <code>org.apache.poi.util</code> package. Another helper method is
+ <code>out()</code> which just saves typing
+ <code>System.out.println()</code>.</p>
+ </section>
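The format ID check recommended above for applications with their own property set type can be sketched without any POI classes. This is only an illustration: the class name `FormatIdCheck` and its methods are hypothetical and not part of HPSF. The 16 bytes are the SummaryInformation format ID in the byte order shown in the sample output, standing in for an application's own unique format ID.

```java
import java.util.Arrays;

final class FormatIdCheck {
    // SummaryInformation format ID, in the byte order the sample output prints.
    // A real application would put its own unique format ID here.
    static final byte[] EXPECTED_FORMAT_ID = {
        (byte) 0xF2, (byte) 0x9F, (byte) 0x85, (byte) 0xE0,
        0x4F, (byte) 0xF9, 0x10, 0x68,
        (byte) 0xAB, (byte) 0x91, 0x08, 0x00,
        0x2B, 0x27, (byte) 0xB3, (byte) 0xD9
    };

    /** Returns true if the 16 bytes read from a section match the expected format ID. */
    static boolean isExpectedFormat(byte[] formatId) {
        return Arrays.equals(EXPECTED_FORMAT_ID, formatId);
    }

    /** Formats bytes as space-separated two-digit hex values. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In a real program these bytes would come from ClassID.getBytes().
        byte[] found = EXPECTED_FORMAT_ID.clone();
        System.out.println("Format ID: " + toHex(found));
        System.out.println("Matches expected format: " + isExpectedFormat(found));
    }
}
```

Comparing the raw bytes avoids depending on any particular string formatting of the ID.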
+
+ <section title="The Properties">
+ <p>Before getting the properties, it is possible to find out how many
+ properties are available in the section via the
+ <code>Section.getPropertyCount()</code> method. The sample application uses this
+ method to print the number of properties to the standard output:</p>
- <source>int propertyCount = sec.getPropertyCount();
+ <source>int propertyCount = sec.getPropertyCount();
out(" No. of properties: " + propertyCount);</source>
- <p>Now its time to get to the properties themselves. You can retrieve a
- section's properties with the method
- <code>Section.getProperties()</code>:</p>
-
- <source>Property[] properties = sec.getProperties();</source>
-
- <p>As you can see the result is an array of <code>Property</code>
- objects. This class has three methods to retrieve a property's ID, its
- type, and its value. The following code snippet shows how to call
- them:</p>
+ <p>Now it's time to get to the properties themselves. You can retrieve a
+ section's properties with the method
+ <code>Section.getProperties()</code>:</p>
+
+ <source>Property[] properties = sec.getProperties();</source>
+
+ <p>As you can see the result is an array of <code>Property</code>
+ objects. This class has three methods to retrieve a property's ID, its
+ type, and its value. The following code snippet shows how to call
+ them:</p>
- <source>for (int i2 = 0; i2 < properties.length; i2++)
+ <source>for (int i2 = 0; i2 < properties.length; i2++)
{
/* Print a single property: */
Property p = properties[i2];
@@ -525,15 +538,17 @@
out(" Property ID: " + id + ", type: " + type +
", value: " + value);
}</source>
+ </section>
- <p>The output of the sample program might look like the following. It shows
- the summary information and the document summary information property sets
- of a Microsoft Word document. However, unlike the first and second section
- of this HOW-TO the application does not have any code which is specific to
- the <code>SummaryInformation</code> and
- <code>DocumentSummaryInformation</code> classes.</p>
+ <section title="Sample Output">
+ <p>The output of the sample program might look like the following. It
+ shows the summary information and the document summary information
+ property sets of a Microsoft Word document. However, unlike the first and
+ second sections of this HOW-TO, the application does not have any code
+ which is specific to the <code>SummaryInformation</code> and
+ <code>DocumentSummaryInformation</code> classes.</p>
- <source>Property set stream "/SummaryInformation":
+ <source>Property set stream "/SummaryInformation":
No. of sections: 1
Section 0:
Format ID: 00000000 F2 9F 85 E0 4F F9 10 68 AB 91 08 00 2B 27 B3 D9 ....O..h....+'..
@@ -588,29 +603,247 @@
No property set stream: "/CompObj"
No property set stream: "/1Table"</source>
- <p>There are some interestion items to note:</p>
+ <p>There are some interesting items to note:</p>
- <ul>
- <li>The first property set (summary information) consists of a single
+ <ul>
+ <li>The first property set (summary information) consists of a single
section, the second property set (document summary information) consists
of two sections.</li>
- <li>Each section type (identified by its format ID) has its own domain of
- property ID. For example, in the second property set the properties with
- ID 2 have different meanings in the two section. By the way, the format
- IDs of these sections are <strong>not</strong> equal, but you have to
- look hard to find the difference.</li>
+ <li>Each section type (identified by its format ID) has its own domain of
+ property IDs. For example, in the second property set the properties with
+ ID 2 have different meanings in the two sections. By the way, the format
+ IDs of these sections are <strong>not</strong> equal, but you have to
+ look hard to find the difference.</li>
+
+ <li>The properties are not in any particular order in the section,
+ although they slightly tend to be sorted by their IDs.</li>
+ </ul>
+ </section>
- <li>The properties are not in any particular order in the section,
- although they slightly tend to be sorted by their IDs.</li>
- </ul>
+ <section title="Property IDs">
+ <p>Properties in the same section are distinguished by their IDs. This is
+ similar to variables in a programming language like Java, which are
+ distinguished by their names. But unlike variable names, property IDs are
+ simple integral numbers. There is another similarity, however. Just like
+ a Java variable has a certain scope (e.g. a member variable in a class),
+ a property ID also has its scope of validity: the section.</p>
+
+ <p>Two property IDs in sections with different section format IDs
+ don't have the same meaning even though their IDs might be equal. For
+ example, ID 4 in the first (and only) section of a summary
+ information property set denotes the document's author, while ID 4 in the
+ first section of the document summary information property set means the
+ document's byte count. The sample output above does not show a property
+ with an ID of 4 in the first section of the document summary information
+ property set. That means that the document does not have a byte
+ count. However, there is a property with an ID of 4 in the
+ <em>second</em> section: This is a user-defined property ID - we'll get
+ to that topic in a minute.</p>
+
+ <p>So, how can you find out what the meaning of a certain property ID in
+ the summary information and the document summary information property set
+ is? The standard property sets as such don't have any hints about the
+ <strong>meanings of their property IDs</strong>. For example, the summary
+ information property set does not tell you that the property ID 4 stands
+ for the document's author. This is external knowledge. Microsoft defined
+ standard meanings for some of the property IDs in the summary information
+ and the document summary information property sets. As a help to the Java
+ and POI programmer, the class <code>PropertyIDMap</code> in the
+ <code>org.apache.poi.hpsf.wellknown</code> package defines constants
+ for the "well-known" property IDs. For example, there is the
+ definition</p>
+
+ <source>public final static int PID_AUTHOR = 4;</source>
+
+ <p>These definitions allow you to use symbolic names instead of
+ numbers.</p>
+
+ <p>To support the reverse direction as well, i.e. to map
+ property IDs to property names, the class <code>PropertyIDMap</code>
+ defines two static methods:
+ <code>getSummaryInformationProperties()</code> and
+ <code>getDocumentSummaryInformationProperties()</code>. Both return
+ <code>java.util.Map</code> objects which map property IDs to
+ strings. Such a string gives a hint about the property's meaning. For
+ example,
+ <code>PropertyIDMap.getSummaryInformationProperties().get(4)</code>
+ returns the string "PID_AUTHOR". An application could use this string as
+ a key to a localized string which is displayed to the user, e.g. "Author"
+ in English or "Verfasser" in German. HPSF might provide such
+ language-dependent ("localized") mappings in a later release.</p>
+
+ <p>Usually you won't have to deal with those two maps. Instead you should
+ call the <code>Section.getPIDString(int)</code> method. It returns the
+ string associated with the specified property ID in the context of the
+ <code>Section</code> object.</p>
+
+ <p>Above you learned that property IDs have a meaning in the scope of a
+ section only. However, there are two exceptions to the rule: The property
+ IDs 0 and 1 have a fixed meaning in <strong>all</strong> sections:</p>
+
+ <table>
+ <tr>
+ <th>Property ID</th>
+ <th>Meaning</th>
+ </tr>
+
+ <tr>
+ <td>0</td>
+ <td>The property's value is a <strong>dictionary</strong>, i.e. a
+ mapping from property IDs to strings.</td>
+ </tr>
+
+ <tr>
+ <td>1</td>
+ <td>The property's value is the number of a <strong>codepage</strong>,
+ i.e. a mapping from character codes to characters. All strings in the
+ section containing this property must be interpreted using this
+ codepage. Typical property values are 1252 (8-bit "western" characters)
+ or 1200 (16-bit Unicode characters).</td>
+ </tr>
+ </table>
+ </section>
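The scoping rule described in this section, where the same numeric ID means different things in different sections, can be modelled with a nested map. This is a minimal sketch, not the `PropertyIDMap` implementation: the class `PropertyIdScope`, the section labels, and the `"unknown"` fallback are hypothetical, and the name `PID_BYTECOUNT` is assumed for the byte count property; only `PID_AUTHOR = 4` is taken from the text.

```java
import java.util.HashMap;
import java.util.Map;

final class PropertyIdScope {
    // Section label -> (property ID -> symbolic name). The labels are
    // hypothetical stand-ins for the sections' real format IDs.
    private static final Map<String, Map<Integer, String>> NAMES = new HashMap<>();

    static {
        Map<Integer, String> summary = new HashMap<>();
        summary.put(4, "PID_AUTHOR");        // stated in the HOW-TO
        NAMES.put("SummaryInformation", summary);

        Map<Integer, String> docSummary = new HashMap<>();
        docSummary.put(4, "PID_BYTECOUNT");  // assumed name for the byte count
        NAMES.put("DocumentSummaryInformation", docSummary);
    }

    /** Looks up a property ID within the scope of one section. */
    static String nameOf(String section, int id) {
        Map<Integer, String> names = NAMES.get(section);
        String name = (names == null) ? null : names.get(id);
        return (name == null) ? "unknown" : name;
    }

    public static void main(String[] args) {
        // The same ID resolves to different names depending on the section.
        System.out.println(nameOf("SummaryInformation", 4));
        System.out.println(nameOf("DocumentSummaryInformation", 4));
    }
}
```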
+
+ <section title="Property types">
+ <p>A property is nothing without its value. It is stored in a property set
+ stream as a sequence of bytes. You must know the property's
+ <strong>type</strong> in order to properly interpret those bytes and
+ reasonably handle the value. A property's type is one of the so-called
+ Microsoft-defined <strong>"variant types"</strong>. When you call
+ <code>Property.getType()</code> you'll get a <code>long</code> value
+ which denotes the property's variant type. The class
+ <code>Variant</code> in the <code>org.apache.poi.hpsf</code> package
+ holds most of those <code>long</code> values as named constants. For
+ example, the constant <code>VT_I4 = 3</code> means a signed integer value
+ of four bytes. Examples of other types are <code>VT_LPSTR = 30</code>
+ meaning a null-terminated string of 8-bit characters, <code>VT_LPWSTR =
+ 31</code> which means a null-terminated Unicode string, or <code>VT_BOOL
+ = 11</code> denoting a boolean value.</p>
+
+ <p>In most cases you won't need a property's type because HPSF does all
+ the work for you.</p>
+ </section>
+
+ <section title="Property values">
+ <p>When an application wants to retrieve a property's value and calls
+ <code>Property.getValue()</code>, HPSF has to interpret the bytes making
+ up the value according to the property's type. The type determines how
+ many bytes the value consists of and what
+ to do with them. For example, if the type is <code>VT_I4</code>, HPSF
+ knows that the value is four bytes long and that these bytes
+ comprise a signed integer value in the little-endian format. This is
+ quite different from e.g. a type of <code>VT_LPWSTR</code>. In this case
+ HPSF has to scan the value bytes for a Unicode null character and collect
+ everything from the beginning to that null character as a Unicode
+ string.</p>
+
+ <p>The good news is that HPSF does another job for you, too: It maps the
+ variant type to an adequate Java type.</p>
+
+ <table>
+ <tr>
+ <th>Variant type:</th>
+ <th>Java type:</th>
+ </tr>
+
+ <tr>
+ <td>VT_I2</td>
+ <td>java.lang.Integer</td>
+ </tr>
+
+ <tr>
+ <td>VT_I4</td>
+ <td>java.lang.Long</td>
+ </tr>
+
+ <tr>
+ <td>VT_FILETIME</td>
+ <td>java.util.Date</td>
+ </tr>
+
+ <tr>
+ <td>VT_LPSTR</td>
+ <td>String</td>
+ </tr>
+
+ <tr>
+ <td>VT_LPWSTR</td>
+ <td>String</td>
+ </tr>
+
+ <tr>
+ <td>VT_CF</td>
+ <td>byte[]</td>
+ </tr>
+
+ <tr>
+ <td>VT_BOOL</td>
+ <td>java.lang.Boolean</td>
+ </tr>
+
+ </table>
+
+ <p>The bad news is that there are still a couple of variant types HPSF
+ does not yet support. If it encounters one of these types it
+ returns the property's value as a byte array and leaves it to be
+ interpreted by the application.</p>
+
+ <p>An application retrieves a property's value by calling the
+ <code>Property.getValue()</code> method. This method's return type is the
+ abstract <code>Object</code> class. The <code>getValue()</code> method
+ looks up the property's variant type, reads the property's value bytes,
+ creates an instance of an adequate Java type, assigns it the property's
+ value and returns it. Primitive types like <code>int</code> or
+ <code>long</code> will be returned as the corresponding class,
+ e.g. <code>Integer</code> or <code>Long</code>.</p>
+ </section>
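To make the byte-level interpretation described in this section concrete, here is a rough sketch of the two cases it mentions: a four-byte little-endian signed integer (`VT_I4`) and a null-terminated 16-bit string (`VT_LPWSTR`). It is not HPSF's `TypeReader` code. The class and method names are hypothetical, and it assumes the string bytes are UTF-16 little-endian and start directly at the given offset, ignoring any length prefix a real property set stream may carry.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class VariantDecodeSketch {
    /** Reads four bytes at offset as a signed little-endian integer. */
    static int readI4(byte[] src, int offset) {
        return ByteBuffer.wrap(src, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
    }

    /**
     * Collects 16-bit characters from offset up to (not including) the first
     * 16-bit null character, or up to the end of the array if there is none.
     */
    static String readNullTerminatedUtf16(byte[] src, int offset) {
        int end = offset;
        while (end + 1 < src.length && (src[end] != 0 || src[end + 1] != 0)) {
            end += 2;
        }
        return new String(src, offset, end - offset, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] number = {0x2A, 0x00, 0x00, 0x00};
        System.out.println("VT_I4 value: " + readI4(number, 0));

        byte[] text = {'H', 0, 'i', 0, 0, 0};
        System.out.println("VT_LPWSTR value: " + readNullTerminatedUtf16(text, 0));
    }
}
```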
- <note>[To be continued.]</note>
- <note>A last note: There are still some aspects of HSPF left which are not
- documented in this HOW-TO. You should dig into the Javadoc API
- documentation to learn further details. Since you struggled through this
- document up to this point, you are well prepared.</note>
+ <section title="Dictionaries">
+ <p>The property with ID 0 has a very special meaning: It is a
+ <strong>dictionary</strong> mapping property IDs to property names. We
+ have seen already that the meanings of standard properties in the
+ summary information and the document summary information property sets
+ have been defined by Microsoft. The advantage is that the labels of
+ properties like "Author" or "Title" don't have to be stored in the
+ property set. However, a user can define custom fields in, say, Microsoft
+ Word. For each field the user has to specify a name, a type, and a
+ value.</p>
+
+ <p>The names of the custom-defined fields (i.e. the property names) are
+ stored in the document summary information second section's
+ <strong>dictionary</strong>. The dictionary is a map which associates
+ property IDs with property names.</p>
+
+ <p>The method <code>Section.getPIDString(int)</code> works not only with
+ the well-known property IDs of the summary information and document
+ summary information property sets, but with self-defined properties,
+ too. It should also work with self-defined properties in self-defined
+ sections.</p>
+ </section>
+
+ <section title="Codepage support">
+ <fixme author="Rainer Klute">Improve codepage support!</fixme>
+
+ <p>The property with ID 1 holds the number of the codepage which was used
+ to encode the strings in this section. The present HPSF codepage support
+ is still very limited: When reading property value strings, HPSF
+ distinguishes between 16-bit characters and 8-bit characters. 16-bit
+ characters should be Unicode characters and thus be okay. 8-bit
+ characters are interpreted according to the platform's default character
+ set. This is fine as long as the document being read has been written on
+ a platform with the same default character set. However, if you receive a
+ document from another region of the world and want to process it with
+ HPSF you are in trouble - unless the creator used Unicode, of course.</p>
+ </section>
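The limitation described above comes from decoding 8-bit strings with the platform default character set instead of the codepage stored in property 1. The following sketch shows what codepage-aware decoding could look like. It is not HPSF code: `CodepageSketch` and its methods are hypothetical, and it handles only the two codepage numbers the HOW-TO mentions, assuming 1252 corresponds to Java's `windows-1252` charset and 1200 to UTF-16 little-endian.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CodepageSketch {
    /** Maps a codepage number to a Java charset (only two codepages covered). */
    static Charset charsetFor(int codepage) {
        switch (codepage) {
            case 1252:
                return Charset.forName("windows-1252");
            case 1200:
                return StandardCharsets.UTF_16LE;
            default:
                throw new IllegalArgumentException("Unsupported codepage: " + codepage);
        }
    }

    /** Decodes raw string bytes using the section's codepage, not the platform default. */
    static String decode(byte[] raw, int codepage) {
        return new String(raw, charsetFor(codepage));
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0x80};
        // Print code points so the result does not depend on the console encoding.
        System.out.printf("codepage 1252: U+%04X%n", (int) decode(raw, 1252).charAt(0));
        System.out.printf("ISO-8859-1:    U+%04X%n",
                (int) new String(raw, StandardCharsets.ISO_8859_1).charAt(0));
    }
}
```

The same byte decodes to different characters under the two charsets, which is exactly the trouble a reader runs into when the document was written on a platform with a different default character set.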
+
+ <section title="Further Reading">
+ <p>There are still some aspects of HPSF left which are not covered by this
+ HOW-TO. You should dig into the Javadoc API documentation to learn
+ further details. Since you've struggled through this document up to this
+ point, you are well prepared.</p>
+ </section>
</section>
</section>
</body>
1.11 +11 -8 jakarta-poi/src/documentation/xdocs/hpsf/todo.xml
Index: todo.xml
===================================================================
RCS file: /home/cvs/jakarta-poi/src/documentation/xdocs/hpsf/todo.xml,v
retrieving revision 1.10
retrieving revision 1.11
diff -u -r1.10 -r1.11
--- todo.xml 2 Feb 2003 20:28:45 -0000 1.10
+++ todo.xml 5 Feb 2003 19:33:27 -0000 1.11
@@ -16,22 +16,25 @@
<ol>
<li>
- <p>Add writing capability for property sets.</p>
+ <p>Add writing capability for property sets. Presently property sets can
+ be read only.</p>
</li>
<li>
- <p>Add codepage support.</p>
- </li>
- <li>
- <p>Add Unicode support.</p>
+ <p>Add codepage support: Presently the bytes making up the string in a
+ property's value are interpreted using the platform's default character
+ set.</p>
</li>
<li>
<p>Add resource bundles to
<code>org.apache.poi.hpsf.wellknown</code> to ease
- localizations.</p>
+ localizations. This would be useful for mapping standard property IDs to
+ localized strings. Example: The property ID 4 could be mapped to "Author"
+ in English or "Verfasser" in German.</p>
</li>
<li>
<p>Implement reading functionality for those property types that are not
- yet supported (other than byte arrays).</p>
+ yet supported. HPSF should return proper Java types instead of just byte
+ arrays.</p>
</li>
<li>
<p>Add WMF to <code>java.awt.Image</code> example code in <link
1.2 +6 -1 jakarta-poi/src/java/org/apache/poi/hpsf/TypeReader.java
Index: TypeReader.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/java/org/apache/poi/hpsf/TypeReader.java,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -r1.1 -r1.2
--- TypeReader.java 10 Dec 2002 06:15:19 -0000 1.1
+++ TypeReader.java 5 Feb 2003 19:33:27 -0000 1.2
@@ -137,6 +137,11 @@
* Read a byte string. In Java it is represented as a
* String object. The 0x00 bytes at the end must be
* stripped.
+ *
+ * FIXME: Reading an 8-bit string should pay attention
+ * to the codepage. Currently the bytes making up the
+ * property's value are interpreted according to the
+ * platform's default character set.
*/
final int first = offset + LittleEndian.INT_SIZE;
long last = first + LittleEndian.getUInt(src, offset) - 1;
1.7 +5 -3 jakarta-poi/src/java/org/apache/poi/hpsf/wellknown/PropertyIDMap.java
Index: PropertyIDMap.java
===================================================================
RCS file: /home/cvs/jakarta-poi/src/java/org/apache/poi/hpsf/wellknown/PropertyIDMap.java,v
retrieving revision 1.6
retrieving revision 1.7
diff -u -r1.6 -r1.7
--- PropertyIDMap.java 10 Dec 2002 06:15:19 -0000 1.6
+++ PropertyIDMap.java 5 Feb 2003 19:33:27 -0000 1.7
@@ -79,7 +79,8 @@
{
/*
- * The following definitions are for the Summary Information.
+ * The following definitions are for property IDs in the first
+ * (and only) section of the Summary Information property set.
*/
public final static int PID_TITLE = 2;
public final static int PID_SUBJECT = 3;
@@ -103,7 +104,8 @@
/*
- * The following definitions are for the Document Summary Information.
+ * The following definitions are for property IDs in the first
+ * section of the Document Summary Information property set.
*/
/**