Posted to commits@tika.apache.org by ni...@apache.org on 2017/05/08 17:46:33 UTC
svn commit: r1794427 - /tika/site/src/site/apt/1.15/formats.apt
Author: nick
Date: Mon May 8 17:46:33 2017
New Revision: 1794427
URL: http://svn.apache.org/viewvc?rev=1794427&view=rev
Log:
Document support for additional Microsoft Office formats
Modified:
tika/site/src/site/apt/1.15/formats.apt
Modified: tika/site/src/site/apt/1.15/formats.apt
URL: http://svn.apache.org/viewvc/tika/site/src/site/apt/1.15/formats.apt?rev=1794427&r1=1794426&r2=1794427&view=diff
==============================================================================
--- tika/site/src/site/apt/1.15/formats.apt (original)
+++ tika/site/src/site/apt/1.15/formats.apt Mon May 8 17:46:33 2017
@@ -67,6 +67,16 @@ Supported Document Formats
Old, pre-OLE2 Excel files (Excel 2, 3 and 4) are handled by the
{{{./api/org/apache/tika/parser/microsoft/OldExcelParser.html}OldExcelParser}}.
+ The older, pre-OOXML, pure-XML Office file formats are handled by
+ {{{./api/org/apache/tika/parser/microsoft/xml/SpreadsheetMLParser.html}SpreadsheetMLParser}},
+ {{{./api/org/apache/tika/parser/microsoft/xml/WordMLParser.html}WordMLParser}}
+ and
+ {{{./api/org/apache/tika/parser/microsoft/ooxml/xwpf/ml2006/Word2006MLParser.html}Word2006MLParser}}.
+
+ Temporary Office lock files (owner files) are supported for basic metadata
+ extraction by
+ {{{./api/org/apache/tika/parser/microsoft/MSOwnerFileParser.html}MSOwnerFileParser}}.
+
* {OpenDocument Format}
The OpenDocument format (ODF) is used most notably as the default format