Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2015/06/23 17:07:00 UTC

[jira] [Updated] (TIKA-1651) Add mime (and parsing?) for Microsoft Chart object

     [ https://issues.apache.org/jira/browse/TIKA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated TIKA-1651:
------------------------------
    Summary: Add mime (and parsing?) for Microsoft Chart object  (was: Excel files embedded in ppt and xls seem to have a high rate of exceptions in govdocs1)

> Add mime (and parsing?) for Microsoft Chart object
> --------------------------------------------------
>
>                 Key: TIKA-1651
>                 URL: https://issues.apache.org/jira/browse/TIKA-1651
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>         Attachments: 11.xls, 428996.ppt, embedded_xls_stack_traces.csv
>
>
> I haven't had a chance to look into this yet, but I wanted to open an issue to track it.  Recently modified tika eval dev code that captures exceptions from embedded documents reports ~30k exceptions in govdocs1 for xls files embedded in ppt and xls files.
> The eval code may be at fault, or these files may be mis-typed, but we should take a look.
> Example files to follow.


