Posted to commits@poi.apache.org by se...@apache.org on 2011/08/09 08:47:01 UTC
svn commit: r1155227 - /poi/trunk/src/documentation/content/xdocs/hwpf/index.xml
Author: sergey
Date: Tue Aug 9 06:47:01 2011
New Revision: 1155227
URL: http://svn.apache.org/viewvc?rev=1155227&view=rev
Log:
more HWPF documentation
Modified:
poi/trunk/src/documentation/content/xdocs/hwpf/index.xml
Modified: poi/trunk/src/documentation/content/xdocs/hwpf/index.xml
URL: http://svn.apache.org/viewvc/poi/trunk/src/documentation/content/xdocs/hwpf/index.xml?rev=1155227&r1=1155226&r2=1155227&view=diff
==============================================================================
--- poi/trunk/src/documentation/content/xdocs/hwpf/index.xml (original)
+++ poi/trunk/src/documentation/content/xdocs/hwpf/index.xml Tue Aug 9 06:47:01 2011
@@ -48,15 +48,63 @@
either have a recent SVN checkout, or a recent SVN nightly build
(including the scratchpad jar!)</p>
- <p>Source in the
- <em>org.apache.poi.hwpf.model</em> tree is the old legacy code refactored
- into an object model. Source code in the
- <em>org.apache.poi.hwpf.extractor</em> tree is a wrapper of this to
- facilitate easy extraction of interesting things (eg the Text).
- Source code in the <em>org.apache.poi.hdf</em> tree is the old legacy
- code.
- </p>
+ <p>
+ Source code in the
+ <em>org.apache.poi.hdf</em>
+ tree is the old legacy code. Source in the
+ <em>org.apache.poi.hwpf.model</em>
+ tree is the old legacy code refactored into a new object model. These packages contain
+ Java representations of the internal Word format structures. This code is "internal" and should not
+ be used by your code. For backward compatibility, some APIs still have references to
+ those packages. They are subject to deprecation and removal. Code from the
+ <em>org.apache.poi.hwpf.usermodel</em>
+ package is the actual public, user-friendly (as far as possible) API for accessing document
+ parts. Source code in the
+ <em>org.apache.poi.hwpf.extractor</em>
+ tree is a wrapper around this to facilitate easy extraction of interesting things (e.g. the text),
+ and the
+ <em>org.apache.poi.hwpf.converter</em>
+ package contains Word-to-HTML and Word-to-FO converters (the latter can be used to generate PDF
+ from Word files when used with
+ <a href="http://xmlgraphics.apache.org/fop/">Apache FOP</a>
+ ). There is also a small file-structure-dumping utility in the
+ <em>org.apache.poi.hwpf.dev</em>
+ package, primarily for development purposes.
+ </p>
+
+ <p>
+ The main entry point to HWPF is HWPFDocument. Currently it has many references both to
+ internal interfaces (the
+ <em>org.apache.poi.hwpf.model</em>
+ package) and to the public API (the
+ <em>org.apache.poi.hwpf.usermodel</em>
+ package). It may be split into two different interfaces (such as WordFile
+ and WordDocument) in later versions.
+ </p>
+
+ <p>A Word document can be considered a single, very long text buffer. The HWPF API provides "pointers"
+ to document parts such as sections, paragraphs and character runs. Typically, a user iterates
+ over the sections of the main document part, the paragraphs of each section and the character
+ runs of each paragraph. Each such interface is a pointer to a subrange of the document text, along
+ with additional properties (they all extend the same Range parent class). There are additional Range
+ implementations such as Table, TableRow, TableCell, etc. Some structures, such as Bookmark or Field,
+ can also provide subrange pointers.
+ </p>
+
+ <p>Changing file content usually requires many synchronized changes in those structures, such as
+ updating property boundaries, position handlers, etc. Because of that, the HWPF API should be
+ considered not thread-safe. In addition, there is a "one pointer" rule for changing
+ content: you should not use two different Range instances at the same time. More
+ precisely, if you change file content through one range pointer, all other range
+ pointers except its parents become invalid. For example, if you obtain the overall range (1),
+ a paragraph range (2) from the overall range and a character run range (3) from the paragraph range,
+ and then change the text of the paragraph, the character run range is now invalid and should not be
+ used, but the overall range pointer is still valid. Each time you obtain a range (pointer), a new
+ instance is created. This means that if you obtain two range pointers and change the document text
+ using the first one, the second one becomes invalid.
+ </p>
+ </section>
<section>
<title>XWPF Patches Required!</title>