Posted to user@poi.apache.org by Drew Cimino <dr...@gmail.com> on 2011/07/15 20:46:48 UTC

XWPFDocument parsing

Hi all. I've been working on a project that makes heavy use of XWPFDocument,
XWPFTable, and XWPFParagraph from the POI libraries. I'm having some issues
because I want to parse the entire document, paragraphs and tables, in the
order they exist in the document. The original plan was to load the
document as an XWPFDocument, and then use getBodyElements() to iterate
through the paragraphs and tables, parsing them in order.

However, getBodyElements() seems to be returning a list of elements that all
span the entire length of the document, and contain only the paragraphs or
only the tables. I'm not sure if the numerous elements are versions tracked
by Word or something, but they seem like multiple instances of the same
document. Can anyone explain what's going on, or what other method I should
be using to do what I want? Thanks very much.

Regards,
Drew

Re: XWPFDocument parsing

Posted by Nick Burch <ni...@alfresco.com>.
On Fri, 15 Jul 2011, Drew Cimino wrote:
> However, getBodyElements() seems to be returning a list of elements that 
> all span the entire length of the document, and contain only the 
> paragraphs or only the tables. I'm not sure if the numerous elements are 
> versions tracked by word or something, but they seem like multiple 
> instances of the same document. Can anyone explain what's going on, or 
> what other method I should be using to do what I want? Thanks very much.

Your best bet is probably to unzip the .docx file and look at the XML of
the document. Does your text occur multiple times in there? How does the
current version differ from the past versions? Is POI giving you
everything that's in the file, or is there a bug?

Nick
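
For anyone following the suggestion above, here is a minimal sketch of that
check using only the JDK, with no POI involved. It opens the .docx as a zip
archive, parses the main document part, and lists the direct children of
w:body in document order (p = paragraph, tbl = table, sectPr = section
properties). It assumes the main part is stored at word/document.xml, which
is the usual location but is not guaranteed by the format. The class and
method names (DocxBodyOrder, bodyChildNames) are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of POI: inspects the raw XML of a .docx.
public class DocxBodyOrder {
    // WordprocessingML main namespace, normally bound to the "w" prefix.
    private static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the local names of the direct children of w:body, in document order. */
    public static List<String> bodyChildNames(String docxPath) throws Exception {
        try (ZipFile zip = new ZipFile(docxPath)) {
            // Assumption: the main document part lives at word/document.xml.
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IOException("word/document.xml not found in " + docxPath);
            }
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getElementsByTagNameNS
            DocumentBuilder builder = factory.newDocumentBuilder();
            try (InputStream in = zip.getInputStream(entry)) {
                Document doc = builder.parse(in);
                List<String> names = new ArrayList<>();
                NodeList bodies = doc.getElementsByTagNameNS(W_NS, "body");
                if (bodies.getLength() == 0) {
                    return names; // no body element found
                }
                // Walk only the direct children, skipping text and comment nodes.
                for (Node n = bodies.item(0).getFirstChild(); n != null; n = n.getNextSibling()) {
                    if (n.getNodeType() == Node.ELEMENT_NODE) {
                        names.add(n.getLocalName());
                    }
                }
                return names;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java DocxBodyOrder <file.docx>");
            return;
        }
        for (String name : bodyChildNames(args[0])) {
            System.out.println(name);
        }
    }
}
```

If the printed list interleaves p and tbl the way the document looks in
Word, the file itself is in order and the question becomes how the elements
are being read back. If the same content shows up more than once in the XML,
the duplication is in the file rather than in the parsing code.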
