Posted to commits@tika.apache.org by ni...@apache.org on 2014/09/02 14:06:02 UTC

svn commit: r1621968 - in /tika/site/src/site/apt: 1.6/formats.apt 1.7/formats.apt

Author: nick
Date: Tue Sep  2 12:06:01 2014
New Revision: 1621968

URL: http://svn.apache.org/r1621968
Log:
Update the supported formats documentation

Modified:
    tika/site/src/site/apt/1.6/formats.apt
    tika/site/src/site/apt/1.7/formats.apt

Modified: tika/site/src/site/apt/1.6/formats.apt
URL: http://svn.apache.org/viewvc/tika/site/src/site/apt/1.6/formats.apt?rev=1621968&r1=1621967&r2=1621968&view=diff
==============================================================================
--- tika/site/src/site/apt/1.6/formats.apt (original)
+++ tika/site/src/site/apt/1.6/formats.apt Tue Sep  2 12:06:01 2014
@@ -86,6 +86,9 @@ Supported Document Formats
    supports the Electronic Publication Format (EPUB) used for many digital
    books.
 
+   The {{{./api/org/apache/tika/parser/xml/FictionBookParser.html}FictionBookParser}} class
+   supports the xml-based Fiction Book publishing format.
+
 * {Rich Text Format}
 
    The {{{./api/org/apache/tika/parser/rtf/RTFParser.html}RTFParser}} class
@@ -115,6 +118,9 @@ Supported Document Formats
    The {{{./api/org/apache/tika/parser/feed/FeedParser.html}FeedParser}} class
    supports the RSS and Atom feed syndication formats.
 
+   The {{{./api/org/apache/tika/parser/iptc/IptcAnpaParser.html}IptcAnpaParser}} class
+   supports the IPTC ANPA News Wire feed format.
+
 * {Help formats}
 
    The {{{./api/org/apache/tika/parser/chm/ChmParser.html}ChmParser}} class
@@ -188,6 +194,10 @@ Supported Document Formats
    extract email messages from the mbox format used by many email archives
    and Unix-style mailboxes.
 
+   The {{{./api/org/apache/tika/parser/mail/RFC822Parser.html}RFC822Parser}} can
+   process single email messages in the RFC 822 format used by many email clients
+   in their archives / exports.
+
    The {{{./api/org/apache/tika/parser/mbox/PSTParser.html}PSDParser}} can
    extract email messages from the Microsoft Outlook PST email format.
 
@@ -203,6 +213,17 @@ Supported Document Formats
    The {{{./api/org/apache/tika/parser/font/AdobeFontMetricParser.html}AdobeFontMetricParser}} 
    class does something similar for Adobe Font Metrics files.
 
+* {Scientific formats}
+
+   The {{{./api/org/apache/tika/parser/hdf/HDFParser.html}HDFParser}}
+   is able to extract attribute metadata from the HDF scientific file format.
+
+   The {{{./api/org/apache/tika/parser/netcdf/NetCDFParser.html}NetCDFParser}}
+   is able to extract attribute metadata from the NetCDF scientific file format.
+
+   The {{{./api/org/apache/tika/parser/mat/MatParser.html}MatParser}}
+   is able to extract attribute metadata from the Matlab scientific file format.
+
 * {Executable programs and libraries}
 
    The {{{./api/org/apache/tika/parser/executable/ExecutableParser.html}ExecutableParser}} can
@@ -210,6 +231,12 @@ Supported Document Formats
    of executable formats and libraries, such as Windows Executables and Linux / BSD 
    programs and libraries.
 
+* {Crypto formats}
+
+   The {{{./api/org/apache/tika/parser/crypto/Pkcs7Parser.html}Pkcs7Parser}} is able to
+   parse the contents of PKCS7 signed messages, but doesn't include any information from
+   the outer PKCS7 wrapper.
+
 Full list of supported formats:
 
    * org.apache.tika.parser.asm.{{{./api/org/apache/tika/parser/asm/ClassParser}ClassParser}}
@@ -350,6 +377,10 @@ Full list of supported formats:
 
       * message/rfc822
 
+   * org.apache.tika.parser.mat.{{{./api/org/apache/tika/parser/mat/MatParser}MatParser}}
+
+      * application/x-matlab-data
+
    * org.apache.tika.parser.mbox.{{{./api/org/apache/tika/parser/mbox/MboxParser}MboxParser}}
 
       * application/mbox

Modified: tika/site/src/site/apt/1.7/formats.apt
URL: http://svn.apache.org/viewvc/tika/site/src/site/apt/1.7/formats.apt?rev=1621968&r1=1621967&r2=1621968&view=diff
==============================================================================
--- tika/site/src/site/apt/1.7/formats.apt (original)
+++ tika/site/src/site/apt/1.7/formats.apt Tue Sep  2 12:06:01 2014
@@ -86,6 +86,9 @@ Supported Document Formats
    supports the Electronic Publication Format (EPUB) used for many digital
    books.
 
+   The {{{./api/org/apache/tika/parser/xml/FictionBookParser.html}FictionBookParser}} class
+   supports the xml-based Fiction Book publishing format.
+
 * {Rich Text Format}
 
    The {{{./api/org/apache/tika/parser/rtf/RTFParser.html}RTFParser}} class
@@ -115,6 +118,9 @@ Supported Document Formats
    The {{{./api/org/apache/tika/parser/feed/FeedParser.html}FeedParser}} class
    supports the RSS and Atom feed syndication formats.
 
+   The {{{./api/org/apache/tika/parser/iptc/IptcAnpaParser.html}IptcAnpaParser}} class
+   supports the IPTC ANPA News Wire feed format.
+
 * {Help formats}
 
    The {{{./api/org/apache/tika/parser/chm/ChmParser.html}ChmParser}} class
@@ -188,6 +194,10 @@ Supported Document Formats
    extract email messages from the mbox format used by many email archives
    and Unix-style mailboxes.
 
+   The {{{./api/org/apache/tika/parser/mail/RFC822Parser.html}RFC822Parser}} can
+   process single email messages in the RFC 822 format used by many email clients
+   in their archives / exports.
+
    The {{{./api/org/apache/tika/parser/mbox/PSTParser.html}PSDParser}} can
    extract email messages from the Microsoft Outlook PST email format.
 
@@ -203,6 +213,17 @@ Supported Document Formats
    The {{{./api/org/apache/tika/parser/font/AdobeFontMetricParser.html}AdobeFontMetricParser}} 
    class does something similar for Adobe Font Metrics files.
 
+* {Scientific formats}
+
+   The {{{./api/org/apache/tika/parser/hdf/HDFParser.html}HDFParser}}
+   is able to extract attribute metadata from the HDF scientific file format.
+
+   The {{{./api/org/apache/tika/parser/netcdf/NetCDFParser.html}NetCDFParser}}
+   is able to extract attribute metadata from the NetCDF scientific file format.
+
+   The {{{./api/org/apache/tika/parser/mat/MatParser.html}MatParser}}
+   is able to extract attribute metadata from the Matlab scientific file format.
+
 * {Executable programs and libraries}
 
    The {{{./api/org/apache/tika/parser/executable/ExecutableParser.html}ExecutableParser}} can
@@ -210,6 +231,12 @@ Supported Document Formats
    of executable formats and libraries, such as Windows Executables and Linux / BSD 
    programs and libraries.
 
+* {Crypto formats}
+
+   The {{{./api/org/apache/tika/parser/crypto/Pkcs7Parser.html}Pkcs7Parser}} is able to
+   parse the contents of PKCS7 signed messages, but doesn't include any information from
+   the outer PKCS7 wrapper.
+
 Full list of supported formats:
 
    TODO Populate this at release time