Posted to user@poi.apache.org by Brendon Allen <br...@mac.com> on 2003/06/03 02:03:43 UTC

Text Stripping from Doc

Hey all,

Sorry to bother you with this seemingly simple question, but is there an
easy way with the HDF stuff to read in a doc and have it strip everything
but the text?  In other words, I want to grab just the raw text out of a doc.
I looked through the Javadocs for pre-2.0 and cannot find an easy way to do
this.  In the meantime I wrote a little stripper code to go in and do most
of it, but if there is a DTD (I am doing this on Word X for Mac docs) it is
hard to do.  Anyway, thanks in advance, and thanks for all your hard work.

Brendon