Posted to user@poi.apache.org by Roland <ro...@onlinehome.de> on 2004/04/19 16:09:44 UTC

POI / HWPF and textmining.org

Hi,

I am currently working on a Java application development project
in which I have to convert MS Word documents to ASCII. By searching
the web, I came across the POI / HWPF API and the Textmining.org
text extraction library that seem to provide solutions for this task.

I installed the Textmining.org libraries V 0.4 and tried out the
example that comes along with this distribution; it seems to
rely upon POI and, in particular, HWPF. However, the POI
distribution poi-src-2.5-final-20040302.tar does not contain HWPF.
Where can I get an appropriate version of HWPF?

Thanks a lot,

Roland




---------------------------------------------------------------------
To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: poi-user-help@jakarta.apache.org


Re: POI / HWPF and textmining.org

Posted by Ryan Ackley <sa...@cfl.rr.com>.
It's in there; look in the src\java\scratchpad directory.

-Ryan




Re: POI / HWPF and textmining.org

Posted by Stephane James Vaucher <va...@cirano.qc.ca>.
Look at tm-extractors-0.4.jar:

[vauchers@localhost vauchers]$ jar tvf /tmp/textmining/tm-extractors-0.4.jar | grep hwpf

  3367 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/CHPBinTable.class
  2100 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/CHPFormattedDiskPage.class
  1281 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/CHPX.class
   822 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/CachedPropertyNode.class
  1490 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/ComplexFileTable.class
   732 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/FormattedDiskPage.class
   407 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/GenericPropertyNode.class
   121 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/HDFType.class
  1454 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/PieceDescriptor.class
  1865 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/PlexOfCps.class
  1103 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/PropertyNode.class
  3934 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/StyleDescription.class
  4305 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/StyleSheet.class
  1683 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/TextPiece.class
  3003 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/TextPieceTable.class
   507 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/UPX.class
   731 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/io/HWPFFileSystem.class
   590 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/io/HWPFOutputStream.class
 12716 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/types/CHPAbstractType.class
 12925 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/model/types/PAPAbstractType.class
  6494 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/sprm/CharacterSprmUncompressor.class
  7074 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/sprm/ParagraphSprmUncompressor.class
  1814 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/sprm/SprmBuffer.class
   597 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/sprm/SprmIterator.class
  2214 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/sprm/SprmOperation.class
   306 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/sprm/SprmUncompressor.class
  1488 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/BorderCode.class
  7123 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/CharacterProperties.class
  1288 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/DateAndTime.class
   704 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/DropCapSpecifier.class
  1122 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/LineSpacingDescriptor.class
  9110 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/ParagraphProperties.class
  1124 Thu Mar 04 00:21:56 EST 2004 org/apache/poi/hwpf/usermodel/ShadingDescriptor.class

sv
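
For anyone who wants to run the same check from Java instead of the shell, here is a minimal sketch of the "jar tvf ... | grep hwpf" step using only the JDK's java.util.jar.JarFile. The class name JarGrep and the method entriesContaining are made up for illustration and are not part of POI or the Textmining.org library. It prints entry names only, not the sizes and timestamps that "jar tvf" shows.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper, not part of POI or tm-extractors.
public class JarGrep {

    /** Returns the sorted names of all jar entries whose path contains the given fragment. */
    static List<String> entriesContaining(Path jar, String fragment) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                    .map(entry -> entry.getName())
                    .filter(name -> name.contains(fragment))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java JarGrep <path-to-jar> [fragment]");
            return;
        }
        // The fragment defaults to "hwpf", matching the grep in the listing.
        String fragment = args.length > 1 ? args[1] : "hwpf";
        entriesContaining(Paths.get(args[0]), fragment).forEach(System.out::println);
    }
}
```

Running it as "java JarGrep /tmp/textmining/tm-extractors-0.4.jar" should list the org/apache/poi/hwpf entries if the jar bundles them, as the listing suggests.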


