Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2016/06/13 12:41:21 UTC

[jira] [Updated] (TIKA-1358) Add support for newer iWork file formats

     [ https://issues.apache.org/jira/browse/TIKA-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated TIKA-1358:
------------------------------
    Attachment: pages.txt
                connors_20040127.txt
                budget.txt

I got Evernote's iwana lparser working. I'm attaching its output for the three test files submitted on TIKA-1966.

Looks like we need to add code to extract info from the tables in budget.numbers.

> Add support for newer iWork file formats
> ----------------------------------------
>
>                 Key: TIKA-1358
>                 URL: https://issues.apache.org/jira/browse/TIKA-1358
>             Project: Tika
>          Issue Type: Wish
>          Components: parser
>    Affects Versions: 1.5
>            Reporter: Jelle Kastelein
>              Labels: new-parser, newbie
>         Attachments: budget.txt, connors_20040127.txt, iwork13-testdocs-zips.zip, iwork13-testfiles-2014-11.zip, pages.txt
>
>
> iWork 2013 uses a revised file format that replaces the XML files holding the content with .iwa files (a binary format). This file format is becoming increasingly relevant as more and more people use Apple products. However, it does not appear to work with the current IWorkPackageParser (tested with several of the example .pages files available from iCloud).
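
To make the format difference described above concrete, here is a rough sketch (not Tika code) of how one might tell the two layouts apart by listing the zip entries. It assumes that the newer documents expose entries ending in .iwa in the top-level zip listing and that the older ones carry an index.xml or index.apxl entry. The function name and the LEGACY_INDEX_NAMES set are made up for illustration, and documents that nest their .iwa files in an inner archive are not handled.

```python
import io
import zipfile

# Assumption: entry names that mark the older XML-based layout.
LEGACY_INDEX_NAMES = {"index.xml", "index.apxl"}


def classify_iwork_zip(data: bytes) -> str:
    """Return 'iwa', 'xml', or 'unknown' for the bytes of a zipped iWork file."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # Compare case-insensitively; entry names may be capitalized.
        names = [name.lower() for name in zf.namelist()]

    # Newer layout: at least one binary .iwa entry anywhere in the listing.
    if any(name.endswith(".iwa") for name in names):
        return "iwa"

    # Older layout: an XML index file, matched on the last path component.
    if any(name.rsplit("/", 1)[-1] in LEGACY_INDEX_NAMES for name in names):
        return "xml"

    return "unknown"
```

A check like this only reports which layout a file appears to use; it does not decode the .iwa content itself.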


