Posted to commits@tika.apache.org by ju...@apache.org on 2008/09/14 20:59:16 UTC
svn commit: r695265 - /incubator/tika/trunk/src/site/apt/formats.apt
Author: jukka
Date: Sun Sep 14 11:59:16 2008
New Revision: 695265
URL: http://svn.apache.org/viewvc?rev=695265&view=rev
Log:
TIKA-157: List all the document formats supported by Tika
Minor fixes.
Modified:
incubator/tika/trunk/src/site/apt/formats.apt
Modified: incubator/tika/trunk/src/site/apt/formats.apt
URL: http://svn.apache.org/viewvc/incubator/tika/trunk/src/site/apt/formats.apt?rev=695265&r1=695264&r2=695265&view=diff
==============================================================================
--- incubator/tika/trunk/src/site/apt/formats.apt (original)
+++ incubator/tika/trunk/src/site/apt/formats.apt Sun Sep 14 11:59:16 2008
@@ -47,7 +47,7 @@
{{{http://poi.apache.org/}Apache POI}} to parse OLE2-based Microsoft
Word documents. Support for Microsoft Word was added in Tika 0.1.
- The Word parser in Tika simply the POI
+ The Word parser in Tika simply uses the POI
{{{http://poi.apache.org/apidocs/org/apache/poi/hwpf/extractor/WordExtractor.html}WordExtractor}}
class to extract text paragraphs from Word documents. Support for more
complex content structures is not yet implemented; see
@@ -98,7 +98,7 @@
PowerPoint presentations. Support for Microsoft PowerPoint was added
in Tika 0.1.
- The PowerPoint parser in Tika simply the POI
+ The PowerPoint parser in Tika simply uses the POI
{{{http://poi.apache.org/apidocs/org/apache/poi/hslf/extractor/PowerPointExtractor.html}PowerPointExtractor}}
class to extract all text as a single paragraph from a PowerPoint document.
Support for more complex content structures is not yet implemented; see
@@ -121,7 +121,7 @@
{{{http://poi.apache.org/}Apache POI}} to parse OLE2-based Microsoft
Visio diagrams. Support for Microsoft Visio was added in Tika 0.2.
- The Visio parser in Tika simply the POI
+ The Visio parser in Tika simply uses the POI
{{{http://poi.apache.org/apidocs/org/apache/poi/hdgf/extractor/VisioTextExtractor.html}VisioExtractor}}
class to extract all text entries from Visio documents.
Support for more complex content structures is not yet implemented; see