Posted to commits@tika.apache.org by ni...@apache.org on 2017/05/08 17:41:31 UTC

svn commit: r1794421 - /tika/site/src/site/apt/1.15/formats.apt

Author: nick
Date: Mon May  8 17:41:31 2017
New Revision: 1794421

URL: http://svn.apache.org/viewvc?rev=1794421&view=rev
Log:
Document support for TSD, WMF and WordPerfect

Modified:
    tika/site/src/site/apt/1.15/formats.apt

Modified: tika/site/src/site/apt/1.15/formats.apt
URL: http://svn.apache.org/viewvc/tika/site/src/site/apt/1.15/formats.apt?rev=1794421&r1=1794420&r2=1794421&view=diff
==============================================================================
--- tika/site/src/site/apt/1.15/formats.apt (original)
+++ tika/site/src/site/apt/1.15/formats.apt Mon May  8 17:41:31 2017
@@ -82,6 +82,14 @@ Supported Document Formats
    {{{./api/org/apache/tika/parser/iwork/IWorkPackageParser.html}IWorkPackageParser}}
    class, which extracts text and metadata.
 
+* {WordPerfect document formats}
+
+   The Corel WordPerfect Office Suite formats are supported by
+   {{{./api/org/apache/tika/parser/wordperfect/WordPerfectParser.html}WordPerfectParser}},
+   supporting WordPerfect WP6+ files, and
+   {{{./api/org/apache/tika/parser/wordperfect/QuattroProParser.html}QuattroProParser}},
+   supporting QuattroPro QPW v9+ files.
+
 * {Portable Document Format}
 
    The {{{./api/org/apache/tika/parser/pdf/PDFParser.html}PDFParser}} class
@@ -185,10 +193,13 @@ Supported Document Formats
    The {{{./api/org/apache/tika/parser/image/ICNSParser.html}ICNSParser}} 
    class extracts simple metadata from the Apple ICNS icon image format.
 
-   When extracting from images, it is also possible to chain in Tesseract via
-   the {{{./api/org/apache/tika/parser/ocr/TesseractOCRParser.html}TesseractOCRParser}}
+   When extracting from images, it is also possible to chain in Tesseract, via
+   the {{{./api/org/apache/tika/parser/ocr/TesseractOCRParser.html}TesseractOCRParser}},
    to have OCR performed on the contents of the image.
 
+   The {{{./api/org/apache/tika/parser/microsoft/WMFParser.html}WMFParser}}
+   class extracts simple text from Microsoft WMF drawings.
+
 * {Video formats}
 
    Tika supports the Flash video format using a simple parsing algorithm 
@@ -309,6 +320,10 @@ Supported Document Formats
    parse the contents of PKCS7 signed messages, but doesn't include any information from
    the outer PKCS7 wrapper.
 
+   The {{{./api/org/apache/tika/parser/crypto/TSDParser.html}TSDParser}} class
+   processes metadata from Time Stamped Data Envelope files, as well as exposing the
+   contents stored within the TSD wrapper.
+
 * {Database formats}
 
    The {{{./api/org/apache/tika/parser/jdbc/SQLite3Parser.html}SQLite3Parser}} is able to