Posted to dev@tika.apache.org by "Alex Ott (Commented) (JIRA)" <ji...@apache.org> on 2011/11/25 15:39:39 UTC
[jira] [Commented] (TIKA-789) Microsoft Project (MPP) basic support
[ https://issues.apache.org/jira/browse/TIKA-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157175#comment-13157175 ]
Alex Ott commented on TIKA-789:
-------------------------------
Detection should be pretty straightforward. An MS Project v8 file should have a /Props stream, while v9 and higher should have a /Props9 stream.
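
The check described above could be sketched roughly as follows. This is only an illustration, not Tika's or POI's actual detector: the class name MppVersionSketch, the enum MppKind and the method classify are hypothetical, and the sketch assumes the names of the top-level entries of the OLE2 container have already been collected into a set (for instance by walking the root directory with POI), with "/Props" meaning an entry named "Props" directly under the root.

```java
import java.util.Set;

// Hypothetical helper for illustration only; not part of Tika or POI.
final class MppVersionSketch {

    // Coarse result of the stream-name check described in the comment above.
    enum MppKind { PROJECT_8, PROJECT_9_OR_LATER, NOT_DETECTED }

    // rootEntryNames: names of the entries directly under the OLE2 root,
    // without the leading slash (an assumption of this sketch).
    static MppKind classify(Set<String> rootEntryNames) {
        // Look for the v9+ marker first, then fall back to the v8 marker.
        if (rootEntryNames.contains("Props9")) {
            return MppKind.PROJECT_9_OR_LATER;
        }
        if (rootEntryNames.contains("Props")) {
            return MppKind.PROJECT_8;
        }
        // Neither stream present: this check alone cannot identify the file.
        return MppKind.NOT_DETECTED;
    }
}
```

Checking for Props9 before Props is a choice made in this sketch, so that a file carrying both entries is reported as the newer format; the comment itself does not say whether such files exist.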
> Microsoft Project (MPP) basic support
> -------------------------------------
>
> Key: TIKA-789
> URL: https://issues.apache.org/jira/browse/TIKA-789
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Affects Versions: 1.0
> Reporter: Nick Burch
> Assignee: Nick Burch
>
> The Microsoft Project file format (MPP) could fairly easily be better supported by Tika. Gaps to fill are:
> * Correct mimetype definition (it's OLE2 based)
> * OLE2 detection for MPP
> * Common OLE2 metadata extraction
> For fuller support (such as text contents), we'd probably want a parser that uses MPXJ. However, as MPXJ is LGPL, it would need to be an external third-party parser. (MPXJ is built on top of POI, but it's under a more copyleft license. POI itself doesn't have MPP support.)