Posted to user@poi.apache.org by Roland <ro...@onlinehome.de> on 2004/04/19 16:09:44 UTC
POI / HWPF and textmining.org
Hi,
I am currently working on a Java application development project
in which I have to convert MS Word documents to ASCII. By searching
the web, I came across the POI / HWPF API and the Textmining.org
text extraction library that seem to provide solutions for this task.
I installed the Textmining.org libraries V 0.4 and tried out the
example that comes along with this distribution; it seems to
rely upon POI and, in particular, HWPF. However, the POI
distribution poi-src-2.5-final-20040302.tar does not contain HWPF.
Where can I get an appropriate version of HWPF?
Thanks a lot,
Roland
---------------------------------------------------------------------
To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: poi-user-help@jakarta.apache.org
Re: POI / HWPF and textmining.org
Posted by Ryan Ackley <sa...@cfl.rr.com>.
It's in there; look in the src\java\scratchpad directory.
-Ryan
Re: POI / HWPF and textmining.org
Posted by Stephane James Vaucher <va...@cirano.qc.ca>.
Look at tm-extractors-0.4.jar; the HWPF classes are already in there:
[vauchers@localhost vauchers]$ jar tvf /tmp/textmining/tm-extractors-0.4.jar | grep hwpf
 3367 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/CHPBinTable.class
 2100 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/CHPFormattedDiskPage.class
 1281 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/CHPX.class
  822 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/CachedPropertyNode.class
 1490 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/ComplexFileTable.class
  732 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/FormattedDiskPage.class
  407 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/GenericPropertyNode.class
  121 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/HDFType.class
 1454 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/PieceDescriptor.class
 1865 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/PlexOfCps.class
 1103 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/PropertyNode.class
 3934 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/StyleDescription.class
 4305 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/StyleSheet.class
 1683 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/TextPiece.class
 3003 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/TextPieceTable.class
  507 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/UPX.class
  731 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/io/HWPFFileSystem.class
  590 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/io/HWPFOutputStream.class
12716 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/types/CHPAbstractType.class
12925 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/types/PAPAbstractType.class
 6494 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/sprm/CharacterSprmUncompressor.class
 7074 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/sprm/ParagraphSprmUncompressor.class
 1814 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/sprm/SprmBuffer.class
  597 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/sprm/SprmIterator.class
 2214 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/sprm/SprmOperation.class
  306 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/sprm/SprmUncompressor.class
 1488 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/BorderCode.class
 7123 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/CharacterProperties.class
 1288 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/DateAndTime.class
  704 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/DropCapSpecifier.class
 1122 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/LineSpacingDescriptor.class
 9110 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/ParagraphProperties.class
 1124 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/ShadingDescriptor.class
sv
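
For anyone without a Unix shell at hand, the same check as the jar listing above can be done from Java itself. The following is only a minimal sketch using the standard java.util.jar classes; the class name JarGrep and the method findEntries are made up for illustration and are not part of POI or the Textmining.org library. It is written for a current JDK (try-with-resources, generics), not the JDKs that were common in 2004.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: lists jar entries whose path contains a given fragment,
// roughly what "jar tvf some.jar | grep hwpf" prints (names only).
public class JarGrep {

    /** Returns the names of all entries in the jar whose path contains the fragment. */
    public static List<String> findEntries(File jar, String fragment) throws IOException {
        List<String> matches = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.contains(fragment)) {
                    matches.add(name);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java JarGrep <jar-file> <path-fragment>");
            return;
        }
        // Print one matching entry name per line.
        for (String name : findEntries(new File(args[0]), args[1])) {
            System.out.println(name);
        }
    }
}
```

Running it as "java JarGrep /tmp/textmining/tm-extractors-0.4.jar hwpf" should print the org/apache/poi/hwpf/... class names if the jar bundles them, which tells you whether a separate HWPF build is needed on the classpath at all.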