Posted to dev@tika.apache.org by "Alex Ott (Commented) (JIRA)" <ji...@apache.org> on 2011/11/25 15:39:39 UTC

[jira] [Commented] (TIKA-789) Microsoft Project (MPP) basic support

    [ https://issues.apache.org/jira/browse/TIKA-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157175#comment-13157175 ] 

Alex Ott commented on TIKA-789:
-------------------------------

Detection should be pretty straightforward. MS Project v8 files should have a /Props stream, while v9 and higher should have a /Props9 stream.
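
As a rough illustration of that check, here is a minimal sketch in plain Java. It assumes the caller has already listed the names of the entries in the OLE2 root directory (for example via POI) and passes them in without the leading slash. The class name, enum, and method are hypothetical and are not part of Tika or POI; the stream names are taken from the comment above and have not been verified against real files.

```java
import java.util.Set;

// Hypothetical helper, not an existing Tika or POI class.
final class MppVersionSketch {

    // Coarse result: the comment only distinguishes v8 from v9 and higher.
    enum MppVersion { V8, V9_OR_LATER, UNKNOWN }

    // rootEntryNames: names of the top-level OLE2 entries, assumed to be
    // given without a leading '/', e.g. "Props9" rather than "/Props9".
    static MppVersion detect(Set<String> rootEntryNames) {
        if (rootEntryNames.contains("Props9")) {
            return MppVersion.V9_OR_LATER;
        }
        if (rootEntryNames.contains("Props")) {
            return MppVersion.V8;
        }
        // Neither marker stream is present: likely not an MPP file.
        return MppVersion.UNKNOWN;
    }
}
```

Checking for Props9 first means a file that happened to carry both streams would be reported as the newer format; whether such files exist is not something this sketch settles.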
                
> Microsoft Project (MPP) basic support
> -------------------------------------
>
>                 Key: TIKA-789
>                 URL: https://issues.apache.org/jira/browse/TIKA-789
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 1.0
>            Reporter: Nick Burch
>            Assignee: Nick Burch
>
> The Microsoft Project file format (MPP) could fairly easily be better supported by Tika. Gaps to fill are:
>  * Correct mimetype definition (it's OLE2 based)
>  * OLE2 detection for MPP
>  * Common OLE2 metadata extraction
> For fuller support (such as text contents), we'd probably want a parser which used MPXJ. However, as MPXJ is LGPL, it'd need to be an external 3rd party parser. (MPXJ is based on top of POI, but it's under a more copyleft license. POI itself doesn't have MPP support)
